
Europäisches Patentamt
European Patent Office
Office européen des brevets

(11) Publication number: 0 461 127 B1

(12) EUROPEAN PATENT SPECIFICATION

(45) Date of publication of patent specification: 27.09.95 Bulletin 95/39

(51) Int. Cl.6: G10L 5/00, G06F 17/28

(21) Application number: 90903079.3

(22) Date of filing: 26.01.90

(86) International application number: PCT/US90/00389

(87) International publication number: WO 90/09020 09.08.90 Gazette 90/19

(54) INTERACTIVE LANGUAGE LEARNING SYSTEM.

(30) Priority: 02.02.89 US 305223

(43) Date of publication of application: 18.12.91 Bulletin 91/51

(45) Publication of the grant of the patent: 27.09.95 Bulletin 95/39

(84) Designated Contracting States:
AT BE CH DE DK ES FR GB IT LI LU NL SE


(56) References cited:
EP-A- 0 036 559
EP-A- 0 087 725
EP-A- 0 294 201
EP-A- 0 294 202
WO-A-90/05350
DE-A- 3 700 796
GB-A- 2 198 871
US-A- 4 627 001
US-A- 4 831 654



(73) Proprietor: AMERICAN LANGUAGE ACADEMY
11426 Rockville Pike
Suite 200
Rockville, MD 20852 (US)

(72) Inventor: WILLETTS, John, A.
1616-C Belmont Street
Washington, D.C. 20009 (US)

(74) Representative: Smith, Norman Ian et al
F.J. CLEVELAND & COMPANY
40-43 Chancery Lane
London WC2A 1JQ (GB)



Note: Within nine months from the publication of the mention of the grant of the European patent, any
person may give notice to the European Patent Office of opposition to the European patent granted.
Notice of opposition shall be filed in a written reasoned statement. It shall not be deemed to have been
filed until the opposition fee has been paid (Art. 99(1) European patent convention).



Jouve, 18, rue Saint-Denis, 75001 PARIS

Description

FIELD OF THE INVENTION 

The invention relates to computer systems with 
speech capabilities. More particularly, the invention 
pertains to a computerized interactive language 
learning system which provides visual text displays 
and associated digitized audio speech. 

BACKGROUND AND SUMMARY OF THE 
INVENTION 

As communications and high speed transporta- 
tion continue to make our world seem smaller, know- 
ing a second language becomes more important and 
valuable. Unfortunately, traditional language instruc- 
tion in the classroom by itself generally does not, due 
to time constraints, sufficiently immerse the student 
in the second language he or she is studying to en- 
sure rapid learning. 

While written materials (e.g., textbooks, work- 
books, and the like) provide some opportunity for the 
student to study by himself, written materials cannot 
effectively assist the student in pronunciation and 
other aural aspects of language learning. Although 
some written language study materials are accompa- 
nied by prerecorded audio tapes or records allowing 
the student to listen to the language being spoken, 
even these prerecorded audio materials have the dis- 
advantage that they cannot provide the student with 
feedback about his or her pronunciation. In the past, 
the only way to obtain effective spoken language 
drills and practice outside of the classroom environ- 
ment was to hire a language tutor (an expensive pro- 
position) or to spend time with someone who was al- 
ready fluent in the unfamiliar language. 

The concept of using computer hardware/software to provide
synthesized or digitized spoken language is generally known. The
following is a somewhat representative (but by no means exhaustive)
listing of prior publications, prior issued U.S. patents, and
published software packages relating to computer-assisted language
learning with speech capabilities:

U.S. Patent No. 4,579,533 to Anderson et al;

U.S. Patent No. 4,591,929 to Newsom;

U.S. Patent No. 4,749,353 to Breedlove;

U.S. Patent No. 4,695,962 to Goudie;

U.S. Patent No. 4,710,877 to Ahmed;

Brower, "Word Torture Eases Pain Of Language Learning",
2 MacWEEK n.48, p. 14 (29 Nov. 1988);

Parham, "Computers That Talk", 8 Classroom Computer Learning
n. 6, pp. 26-36, 63 (March 1988);

Jack, "Worte & Satze: A German Tutor For Kids Or Adults",
2 Color Computer Magazine n.3, p. 20 (May 1984);

Barbour, "Computerized Speech: Talking Its Way Into The
Classroom", 6 Electronic Learning, n.4, p. 15 (Jan 1987);

PEAL SOFTWARE (Los Angeles, California), "Representational
Play", "Keytalk", and "Exploratory Play" software packages;

"E Z Pilot II Authoring System" software by Hartley Courseware,
Inc., Dimondale, Michigan;

"Smoothtalker Version 2.0" software by First Byte Inc.;

"Experlogo-Talker/Prologo" software by Experintelligence, Inc.;

"Voice Master Version 4.0" system by Covox Inc.;

"Basic Language Series - Spatial Concepts" by Science Research
Association;

"Talking Text Writer" and "Talking Text Speller" software
published by Scholastic Inc., Jefferson City, Missouri;

"Reading Skills Development Program" software available from
American Educational Computer, Inc., Oklahoma City, Oklahoma;

"Writing To Read" by International Business Machines;

"Language Experience" software series from Teacher Support
Software, Gainesville, Florida; and

Houghton Mifflin's "Listen and Learn" series, Houghton Mifflin
Educational Software Division, Hanover, New Hampshire.

Additional patents generally relating to learning aids with speech
synthesizers include:

U.S. Patent No. 4,769,846 to Simmons;

U.S. Patent No. 4,403,965 to Hawkins;

U.S. Patent No. 4,421,487 to Laughon et al;

U.S. Patent No. 4,457,719 to Dittakavi et al; and

U.S. Patent No. 4,549,867 to Dittakavi.

The Anderson et al '533 patent cited above discloses a
microprocessor based electronic teaching aid which enables the
student viewing a display to designate any word or portion of text
for vocalization by synthesized speech techniques. The "reading"
material provided by the system is stored in a preprogrammed
(fixed) source, read only memory. Pointers are used to point to the
start addresses for the words. Mass storage devices are avoided in
favor of semiconductor ROM memory. Speech data is stored in the
memory as individual words in a dictionary. No facility for
inputting digitized student utterances into the system is provided.

U.S. Patent No. 4,591,929 to Newsom teaches a second language
learning system connected to a magnetic tape recorder. An
electronic interface controls the tape recorder functions. The last
phrase played back by the tape recorder is converted into digital
form and stored in an electronic store to permit the student to
reproduce the phrase as many times as desired without having to
rewind the tape. The student can also record his own voicing of a
phrase in a different portion of the electronic store and can then
selectively reproduce the teaching phrase or his response,
re-recording his voicing until satisfied.

U.S. Patent No. 4,710,877 to Ahmed discloses a computer-based
language learning system including a speech synthesis capability
using linear predictive coding. A menu driven student interface is
used to step a student through preprogrammed lessons featuring
visual and synthesized speech stimuli.

U.S. Patent No. 4,695,962 to Goudie teaches a system which attempts
to increase the naturalness of synthesized speech produced from
linear predictive encoded speech data by substituting different
data depending upon whether words are reproduced in isolation in a
word mode or together with other words in a phrase mode.

The Breedlove '353 patent discloses a hand-held microprocessor
based system that converts student utterances into digital form and
allows the student to store the digitized utterances in memory
associated with student inputted text such as correct word
spelling.

The "Word Torture" software program referenced above is another
example of a computer-assisted language learning system. This
program, published by Hyperglot Software Co. of Knoxville, TN, is
designed to run on an Apple Macintosh personal computer equipped
with a "HyperCard" programmable database which supports digitized
and synthesized sound. Foreign language study stacks provide
automated vocabulary drills that work from English to a foreign
language or vice versa, and permit users to adjust interval times
and add new words. The system also provides digitized
pronunciations of foreign language alphabets.

Other systems (including the Scholastic Software "Talking Text
Writer" program) are essentially talking word processors with
speech synthesis capabilities to allow students to hear whatever is
typed as well as hear text entered by the teacher.

However, as observed by Parham in his survey article "Computers
That Talk" discussed above, language arts system developers have in
the past had great difficulties providing acceptable, useful
systems. Known text-to-speech synthesis algorithms are capable of
converting written text into synthesized spoken words by
referencing prestored "phonemes" (sets of sounds). The
"Smoothtalker", "Experlogo-Talker" and "Talking Text Writer"
systems referenced above are examples of systems which use
text-to-speech synthesis. While text-to-speech synthesis may be
acceptable for talking word processors, user interfaces, or the
like, known algorithms cannot produce the range of inflections
(stress and intonations) and pronunciations required for language
learning.

The digitized speech approach (i.e., in which actual human speech
is converted to digital signals using digitizing hardware for later
reproduction) is capable of producing speech as realistic as
recorded voice, in any language and including accent and
inflection. However, the use of digitized speech is extremely
memory intensive (a limitation which has proven to be a major
roadblock in its use in the past). A single second of digitized
speech can occupy 64 Kbytes of storage space (somewhat less if
compression algorithms are used). To reduce the amount of memory
required, some system developers have reused words by encoding and
storing words and phrases individually. This has, however, been a
problematic approach for language learning in the past, since it
has been shown that students learn best when presented with words
in natural context (and the same word or phrase is often pronounced
differently depending upon context; see the Goudie '962 patent
referenced above).

Most prior digitized speech systems have been limited to playing
back prestored digitized speech. However, some prior systems also
permitted the student to digitize his own speech for later play
back. For example, Covox, Inc. claims that its "Voice Master"
speech synthesis system speaks in the user's own voice, in any
language, and with any accent. To record speech, a "learn" command
is inputted and the student speaks into a microphone. To play back
the recorded speech, the student inputs the "speak" command. Up to
64 different words, phrases or other sounds can be in memory at any
one time, with additional words being stored on disk and loaded as
needed.

See also U.S. Patent No. 4,591,929 to Newsom discussed above, which
teaches: (a) digitizing a phrase spoken by the user and storing the
digitized user's phrase in an electronic store along with a
digitized teaching phrase (played back from a tape recorder); and
(b) permitting the user to selectively reproduce the teaching
phrase or his own response. However, Newsom provides only minimal
digitized speech storage (e.g., a single teaching phrase) and
requires the student to control the functions of a tape recorder in
order to select a different teaching phrase. The process of
rewinding/fast forwarding a tape recorder is extremely cumbersome.
Moreover, Newsom provides no facility for integrating textual
material, graphical or other display, or other study aids with his
strictly oral lesson.

DE-A-3700796 describes a voice trainer which 
makes use of a display for displaying speech graphi- 
cally in the form of intonation curves and frequency 
diagrams. 

Hence, although much prior work has been done in the area of
computer-assisted language learning, there is room for much further
improvement.

For example, no one in the past has successfully developed a truly
interactive computer assisted language learning system which
integrates visual displays with preprogrammed digitised speech and
which also interactively digitizes student speech and permits the
student to easily listen to his own pronunciation and compare it
with the digitised pronunciation of a model word or phrase he
selects. Significantly, the present invention may provide the very
first truly interactive computer assisted language learning system
which allows a student to select a model phrase from text displayed
on an electronic display; record (in digitised form) his own
pronunciation of that phrase; and instantly listen to the digitised
vocal version of the selected phrase and his own recorded
pronunciation for comparison purposes.

According to one aspect of the present invention there is provided
an interactive language learning system comprising

storing means for storing in digital form data representative of a
model speech version of a passage of language and for storing data
representative of user input speech,

a display for displaying visual information corresponding to the
passage, and

selecting means operatively connected to said display and to said
storing means and operative to provide a comparison of the user
input speech with the stored model version of the passage of
language,

characterised in that

said storing means is arranged to store a digitised speech version
of the passage of language and also to store a digital data text
version of said same passage,

said selecting means is operable by a user to select a portion of
said passage and to cause text corresponding to said selected
passage portion to be displayed on said display based on said
stored digital data text version, and in that the system includes
speech processing means for:

selecting the portion of said stored digitised model speech version
corresponding to said selected portion of said passage,

converting said selected digitised model speech version portion to
audio signals for use in generating speech sounds,

converting audio signals representing user input speech into
digitised signals representing said user speech input and

subsequently reconverting said digitised speech signals
representing said user input speech to audio signals.
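The arrangement summarised above, in which one stored passage carries both a text version and a digitised speech version and a selected portion of the text maps onto the matching portion of the speech, can be pictured with a small sketch. This is only an illustration under stated assumptions: the names `Phrase` and `Passage`, the use of byte offsets to mark portions, and the in-memory storage are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Phrase:
    """One selectable portion of the passage (hypothetical layout)."""
    text: str   # text version of this portion
    start: int  # assumed: byte offset of the portion in the model speech
    end: int    # assumed: one past the last byte of the portion


@dataclass
class Passage:
    speech: bytes          # digitised model speech for the whole passage
    phrases: List[Phrase]  # text portions linked to speech byte ranges
    user_speech: Dict[int, bytes] = field(default_factory=dict)

    def select(self, index: int) -> Tuple[str, bytes]:
        """Return the text and the model speech for the selected portion."""
        p = self.phrases[index]
        return p.text, self.speech[p.start:p.end]

    def record(self, index: int, digitised: bytes) -> None:
        """Store the user's digitised speech for the selected portion."""
        self.user_speech[index] = digitised

    def compare(self, index: int) -> Tuple[bytes, bytes]:
        """Return (model, user) speech so both can be played back in turn."""
        _, model = self.select(index)
        return model, self.user_speech[index]
```

In this sketch the "comparison" is simply handing back both recordings for playback one after the other; the listening and judging is left to the student.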

Many other significant advantageous features are provided by
embodiments of the present invention, including the following:

• SoundSort - A text reconstruction exercise based on aural clues.
  In accordance with this feature of the invention, the system
  automatically randomizes the order of plural phrases, provides
  digitized utterances of the phrases in the randomized order, and
  requires the student to reconstruct the original order using a
  visual display interface.

• An audio CLIP mode which permits the student to select any
  (random) portion of displayed text (e.g., a phrase, a small part
  of a phrase, a single word, a syllable, or a phoneme) using
  cursor controls and to control the system to play the digitized
  speech corresponding to that selected portion. This feature
  allows the student to concentrate on difficult phrases.

• Integration of digitized sound in a high-level authoring system
  (as distinct from an authoring language) is provided. An
  easy-to-use "WYSIWYG" ("What you see is what you get") user
  interface reduces or eliminates mistakes and associated
  frustration and does not require the user to have any
  programming ability.

• An extremely flexible authoring system allows a teacher to link
  recorded digitized speech with customized on-screen text (which
  may but need not match the digitized speech). This allows a wide
  variety of free-form exercises to be created.

• The system permits the student to hear his own speech and the
  correct (model) speech, each at a keystroke, with no delay.

• Teacher-composed customized help screens and instructions can be
  referred to by the student upon depressing a single key. This
  feature permits great increases in the number of possible
  teacher-created lesson formats and also provides great
  flexibility in customization and ease of use not provided in
  other systems.

• Despite the fact that digitized speech is employed, interrupt
  driven hardware in conjunction with software operating in the
  background permits essentially continuous replay of digitized
  audio data stored on a mass storage device, without pauses due
  to loading and reloading of memory (for up to 23 hours of
  continuous speech from a CD ROM mass storage device, for
  example).

The presently preferred exemplary embodiment of the invention
provides a system including several functional modules which are
implemented in hardware, software or both. A digital speech
processor connected to a conventional personal computer is used to
convert digitized speech data to audio signals and vice versa under
control of a memory resident interrupt driven software module (this
module handles all play and record requests for the speech
processor). A public domain RAMdisk driver sets aside memory for
use as a simulated (virtual) disk drive. In the preferred
embodiment, all recorded speech is placed on the virtual disk
first, then copied to other mass storage devices (e.g., floppy
disk).

The personal computer processor executes program control steps in
the preferred embodiment which provide a wide variety of useful
functions. These functions may be divided into "teacher" functions
(used to create and compose lessons and exercises) and "student"
functions (performed by the student for learning purposes). The
student functions generally operate on lessons and exercises
previously created by the teacher using the teacher functions.

One of the teacher functions is a "Text Writer" 
word processor permitting the teacher to compose 
texts. A lesson authoring utility is then used to record 
segments of sound (phrases) which are linked to 
phrases in on-screen text(s) composed with the word 
processor. The teacher may also select a second 
(page two) textual display format to be presented as 
instructions or help to the student. After recording the
phrases, the teacher selects which of three student 
functions will be used with the newly created lesson. 
The teacher may, therefore, create texts and exercis- 
es appropriate to any of the three functions. 

Three student functions are provided in the preferred embodiment:
(a) AudioLab (which provides aural and oral practice and learning);
(b) SoundSort (an aural text reconstruction exercise); and
(c) AudioWrite (a writing exercise focusing on listening
comprehension).

The AudioLab student function in the preferred 
embodiment provides three modes: (i) PREVIEW, (ii) 
LAB, and (iii) CLIP. 

In the PREVIEW mode, the student can listen to 
an entire prerecorded lesson with the option to view 
the corresponding complete text on the personal 
computer display screen. Thus, the student hears the 
digitized model speech of a lesson and can also view 
the displayed corresponding text (generally the text of 
the speech) as an audio-visual lesson. 

In the LAB mode, the student can select individ- 
ual phrases from the recording. The student may also 
view the complete text on the display - or only the text 
corresponding to a phrase selected by the student.
The student can also record himself speaking any in- 
dividual phrase of his choosing, and play back his own 
speech and the corresponding preprogrammed mod- 
el digitized speech so as to compare the two. 

In the CLIP mode, the student can work with any 
selected portion of the current phrase (down to 0.1 
seconds long in the preferred embodiment). The stu- 
dent can play the entire original phrase or only a por- 
tion of the phrase he selects; record himself speak- 
ing; and compare his played back speech to the orig- 
inal. Moreover, the student can examine phrases in 
three different ways in the preferred embodiment: forwards
(e.g., "This/is/an/el/e/phant"); backwards (e.g.,
"phant" - "e/phant" - "el/e/phant"); or middle (e.g., 
"is/an"). 

The SoundSort function provides a computer puzzle exercise which
randomizes (jumbles) the order of phrases in a lesson text. A
column of symbols is displayed representing the phrases in the
lesson text. The student must restore the symbols into the correct
order by moving the symbols around the display screen (using
interactive cursor controls and the like). The only clues provided
by the preferred embodiment as to the correct order of the phrases
are aural versions of the phrases obtained by listening to selected
phrases (as many times as the student desires) and by listening to
the complete, original lesson. The text is not shown on the screen
in the preferred embodiment, requiring the student to listen to the
phrases and reorder them into the proper context.
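The SoundSort exercise described above amounts to shuffling phrase positions, letting the student move symbols, and checking whether the original order has been restored. A minimal sketch follows. The function names `jumble`, `move` and `is_restored` are hypothetical, phrases are represented only by their original index, and forcing the shuffle to differ from the original order is an assumption, not something the specification states.

```python
import random


def jumble(phrase_count, rng=None):
    """Return phrase indices in a random order.

    Assumption: a jumble that happens to equal the original order is
    reshuffled so the student always has something to sort.
    """
    rng = rng or random.Random()
    order = list(range(phrase_count))
    if phrase_count > 1:
        while order == sorted(order):
            rng.shuffle(order)
    return order


def move(order, src, dst):
    """Move the symbol at position src to position dst (new list)."""
    order = list(order)
    order.insert(dst, order.pop(src))
    return order


def is_restored(order):
    """True when the symbols are back in the original lesson order."""
    return order == sorted(order)
```

Each entry in the returned order stands for one on-screen symbol; playing the phrase behind a symbol would use the index to look up its digitized speech.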

The AudioWrite function of the preferred embodiment provides the
digitized speech lesson one phrase at a time, and requires the
student to type or reconstruct what he hears (with complete freedom
of correction and repetition). The phrase typed in by the student
is then compared to the original text, and any differences are
flagged as errors. Punctuation, spacing and capitalization are
provided by the system in the preferred embodiment and are thus not
tested.
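The AudioWrite comparison step can be sketched as a word-level diff between the typed phrase and the original text. The specification does not say how the comparison is performed, so the use of Python's `difflib` here is an assumption. The specification says the system itself supplies punctuation, spacing and capitalization; this sketch stands in for that by normalising them away before comparing. The function names `words` and `flag_errors` are hypothetical.

```python
import difflib
import string


def words(text):
    """Lower-case the text, strip ASCII punctuation, and split into words."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def flag_errors(typed, original):
    """Return a list of (kind, expected_words, typed_words) differences.

    kind is one of difflib's opcode tags: 'replace', 'delete' or 'insert'.
    An empty list means the typed phrase matches the original.
    """
    expected, actual = words(original), words(typed)
    matcher = difflib.SequenceMatcher(a=expected, b=actual, autojunk=False)
    errors = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            errors.append((tag, expected[i1:i2], actual[j1:j2]))
    return errors
```

For instance, typing "this is a elephant" against the original "This is an elephant." yields a single flagged replacement of "an" by "a".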

Thus, the highly integrated speech and visuals provided by the
present invention permit a student to:

see, hear, record and compare complete text or dialogue, phrase by
phrase (or by selected portions of a phrase);

practice listening comprehension; and

instantly, randomly access any part of a recorded selection.

The system also provides teachers with an easy-to-use utility for
creating an infinite variety of exercises.

BRIEF DESCRIPTION OF THE DRAWINGS 

These and other features and advantages of the present invention
will be better and more completely understood by studying the
following detailed description of a presently preferred exemplary
embodiment in conjunction with the appended sheets of drawings, of
which:
FIGURE 1 is a schematic block diagram of a presently preferred
  exemplary embodiment of an interactive language learning system
  in accordance with the present invention;
FIGURE 2 is a high level schematic flow chart-type description of
  the options presented to a student by the system shown in
  FIGURE 1;
FIGURES 3A-5 are graphical flow illustrations of the options shown
  in FIGURE 2;
FIGURES 6A-6B are together a flow chart of exemplary program
  control steps performed by the FIGURE 1 system to provide the
  options shown in FIGURE 2;
FIGURES 7A-7D are together a flow chart of exemplary program
  control steps relating to the AudioLab routine (function) shown
  in FIGURE 6;
FIGURES 8A-8B are together a schematic flow chart of exemplary
  program control steps related to the AudioWrite routine
  (function) shown in FIGURE 6;
FIGURES 9A-9C are together a schematic flow chart of exemplary
  program control steps performed by the FIGURE 1 system upon
  execution of the SoundSort routine (function) shown in FIGURE 6;
FIGURE 10 is a high-level schematic flow chart-type diagram of the
  options presented to a teacher by the FIGURE 1 system to permit
  the teacher to create lessons;
FIGURES 11A-11E are together a schematic flow chart of exemplary
  program control steps performed by the FIGURE 1 system to permit
  a teacher to create lessons;
FIGURE 12 is a flow chart of exemplary program control steps
  performed by the "Select File" routine shown in FIGURE 11A;
FIGURE 13 is a schematic flow chart of exemplary program control
  steps performed by the "Choose Drive" routine shown in FIGURE 12;
FIGURE 14 is a schematic flow chart of exemplary program control
  steps performed by the "DIR MENU" routine shown in FIGURE 12; and
FIGURES 15A-15B are together a schematic flow chart of exemplary
  program control steps performed by the FIGURE 1 system to execute
  the "FILE MENU" routine shown in FIGURE 12.

DETAILED DESCRIPTION OF PRESENTLY 
PREFERRED EXEMPLARY EMBODIMENTS 

FIGURE 1 is a schematic block diagram of a pre- 
sently preferred exemplary embodiment of an inter- 
active language learning system 50 in accordance 
with the present invention. In the preferred embodi- 
ment, system 50 includes a conventional personal 
computer 52 (e.g., IBM PC or true compatible provid- 
ed with a conventional DOS disk operating system 
version 2.1 or higher and at least 384 kilobytes of ran- 
dom access memory); a keyboard input device 54; a 
mass storage device 56 (which may be one or more 
floppy diskette drives and associated floppy disk- 
ettes, Winchester-type hard disk drives and/or CD 
ROM drives); a conventional CRT-type display 58; 
and a speech processor 60 connected to an appropri- 
ate audio input/output device (a conventional head- 
set-type speaker/microphone arrangement 62a 
and/or a microphone/loudspeaker combination 62b 
with appropriate external audio amplifiers as neces- 
sary). 

In the preferred embodiment, speech processor 60 is a modified
conventional model VP625 PC-compatible digital processor board
manufactured by ANTEX Electronics of Gardena, California. This
conventional speech processor, which is described in readily
available ANTEX Electronics published specifications and can be
purchased directly from the manufacturer, plugs directly into a
so-called expansion slot of personal computer 52, and makes
available on the personal computer rear panel an audio input/output
socket. Speech processor 60 converts audio signals applied to its
audio input into ADPCM (Adaptive Differential Pulse Code
Modulation) encoded digital data in a conventional manner for
storage on mass storage device 56, and also converts previously
recorded ADPCM encoded digital data stored on the mass storage
device into audio signals provided at the speech processor audio
output socket (also in a conventional manner).

Speech processor 60 samples the audio waveform presented at its
audio input (e.g., from the microphone of headset 62a or from
separate microphone 62b) at either 8, 12 or 16 kHz using ADPCM
encoding technology. At the 16 kHz sampling rate, full fidelity
sound is produced with a frequency response of 20 Hz to 7.0 kHz.
The use of ADPCM provides a data reduction of better than 2-to-1
over other standard digitization techniques.
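To make the storage implications of the three sampling rates concrete, the arithmetic can be sketched as follows. The specification does not state the number of bits per ADPCM sample or a CD ROM capacity, so both values below (4 bits per sample, 650 million bytes) are assumptions chosen only for illustration, and the function names are hypothetical.

```python
# Assumed values, not taken from the specification.
BITS_PER_SAMPLE = 4            # assumption: 4-bit ADPCM samples
CD_ROM_BYTES = 650 * 10**6     # assumption: nominal CD ROM capacity


def bytes_per_second(sample_rate_hz, bits_per_sample=BITS_PER_SAMPLE):
    """Storage consumed by one second of encoded speech."""
    return sample_rate_hz * bits_per_sample // 8


def hours_of_speech(capacity_bytes, sample_rate_hz,
                    bits_per_sample=BITS_PER_SAMPLE):
    """Hours of continuous speech that fit in the given capacity."""
    seconds = capacity_bytes / bytes_per_second(sample_rate_hz, bits_per_sample)
    return seconds / 3600


if __name__ == "__main__":
    for rate in (8000, 12000, 16000):
        print(rate, "Hz:", bytes_per_second(rate), "bytes/s,",
              round(hours_of_speech(CD_ROM_BYTES, rate), 1), "hours per disc")
```

Under these assumptions the 16 kHz rate consumes 8,000 bytes per second, which works out to between 22 and 23 hours on the assumed disc. That is in the same range as the "up to 23 hours" figure quoted in the summary, but the agreement is suggestive only, since neither assumed value comes from the source.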

Speech processor 60 of the preferred embodiment operates in
background under interrupt control of the conventional DOS disk
operating system and the associated microprocessor internal to
personal computer 52, using any one of several programmable
interrupt and I/O addresses. Speech processor 60 also provides
software programmable volume controls on both audio input and
output and a software-addressable level detector to provide an
indication of signal amplitude during record/playback.

In the preferred embodiment, speech processor board 60 is modified
so that headset 62a can be connected directly to it using a DB-9
headset connector and is also provided with a program controlled
volume level and dual input capability (to support both the astatic
microphone of an AKG K-18 headset and an external 5V signal). A
microphone preamplifier stage is also included to provide an
increased signal-to-noise ratio.

Mass storage device 56 in the preferred embodiment stores three
types of digital signal information: a) digitized speech
information; b) text information associated with the speech
information; and c) program control instructions (which control the
processor and other associated components within personal computer
52 to perform the interactive language learning functions provided
by the present invention). Keyboard 54 is used to permit the user
(student or teacher) to interact with the execution of the program
control steps, while display 58 permits the user to view graphics,
text and other visually-presented information.

STUDENT FUNCTIONS 

FIGURE 2 is a high-level flow chart-type diagram of the options
presented to a student by system 50, and FIGURES 3A-5 are graphical
illustrations of these options. The options provided by the
preferred embodiment to the student in effect constitute an audio
visual user interface with which the student may interact in order
to learn a second language.

Upon starting system 50 (e.g., by powering on 
personal computer 52 and its associated peripherals 
and controlling the personal computer to begin exe- 
cuting program instructions stored in mass storage 
device 56, FIGURE 2, block 100), a display title 
screen is displayed on display 58 (block 102) and sys- 
tem 50 then prompts the student for an audio disk 
(block 104). Typically, lessons are stored on floppy 
diskettes so that the student may easily change les- 
sons by simply inserting another diskette into person- 
al computer 52. System 50 displays the names of the 
lessons stored on the audio disk to permit the student 
to change lessons if he desires. A main menu is then 
displayed on display 58 which permits the student to 
select among five different options in the preferred
embodiment: 1) AudioLab; 2) SoundSort; 3) AudioWrite;
4) Change Audio Disk; and 5) Exit. The student
uses up arrow and down arrow cursor control keys in 
the preferred embodiment to select one of the five op- 
tions, and then depresses the enter key to cause that 
option to be executed. 
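The main-menu interaction described above (arrow keys move a highlight, the enter key executes the highlighted option) can be sketched as a tiny key-driven loop. The key names "UP", "DOWN" and "ENTER" and the function `run_menu` are hypothetical, and whether the highlight wraps around at the ends of the list is not stated in the specification; this sketch assumes it stops at the first and last options.

```python
OPTIONS = ["AudioLab", "SoundSort", "AudioWrite", "Change Audio Disk", "Exit"]


def run_menu(keys, options=OPTIONS):
    """Consume key names and return the option chosen with ENTER.

    Returns None if the key sequence ends without ENTER being pressed.
    Assumption: the highlight starts on the first option and does not wrap.
    """
    selected = 0
    for key in keys:
        if key == "UP":
            selected = max(0, selected - 1)
        elif key == "DOWN":
            selected = min(len(options) - 1, selected + 1)
        elif key == "ENTER":
            return options[selected]
    return None
```

A caller would dispatch on the returned name, for example prompting for a new audio disk when "Change Audio Disk" is returned.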

The exit option causes the interactive language 
learning functions provided by the preferred embodi- 
ment to terminate execution. The option to change au- 
dio disk causes system 50 to prompt for a new audio 
disk (block 104). The AudioLab, SoundSort and Au- 
dioWrite options perform interactive language learn- 
ing functions that will now be explained. 

The AudioLab function provides the student with practice in
pronunciation and listening comprehension. In the preferred
embodiment, the AudioLab option or function has three different
modes: PREVIEW (block 108); LAB (block 110); and CLIP (block 112).
In the PREVIEW mode, the student listens to the entire selected
text and also sets the playback volume for all of the routines
(AudioLab, SoundSort and AudioWrite). In the LAB mode (block 110),
the student listens to phrases from the text and may also record
his own speech and may compare his played back voice to the
original. In both the PREVIEW and LAB modes, the student may choose
to see the phrases and text in different combinations on display
58, or he can choose to listen without viewing the text. From the
LAB mode, the student may select the CLIP mode (block 112). In the
CLIP mode, the student may choose to work on any selected portion
of a phrase to permit him to practice difficult sounds.

Upon selecting the AudioLab option from main
display 106 in the preferred embodiment, system 50
begins performing the AudioLab PREVIEW mode
(block 108). FIGURE 3A is a graphical description of
some of the options presented by the preferred
embodiment in the PREVIEW mode. The student may
adjust volume level by depressing the left arrow key
(to decrease volume level) or the right arrow key (to
increase volume level) - and this volume level
adjustment remains in effect for all functions (programs)
provided by system 50. To begin playing a prerecorded
lesson, the student depresses the F2 function key on
keyboard 54 in the preferred embodiment. In the
preferred embodiment, this causes system 50 to begin
producing audio in headset 62a by controlling speech
processor 60 to convert digitized speech stored on
mass storage device 56 into audio signals.

In the preferred embodiment, one or more
screens of text may be associated with a particular
block of stored digitized speech, and in the PREVIEW
mode this text may be displayed by display 58 while
system 50 produces the converted audio signals from
the digitized speech. This associated text is typically
actual text corresponding to the speech being reproduced
(since the student may then "read along" with
the digitized speech being played back), but it may
have some other contents - depending upon what
the teacher desires (as will be explained). To stop the
speech (and text) generation, the student may depress
the ESC (escape) key of keyboard 54. To resume
speech/text reproduction, the student may depress
the F2 key again. Depressing the ESC key another
time returns the student to the main menu (FIGURE 2,
block 106). Depressing the ENTER key causes
the LAB mode to be entered (FIGURE 2, block
110). On-line help is available by depressing the F1
key in the preferred embodiment.

FIGURE 3B is a graphical illustration of options
available to the student in the LAB mode of the AudioLab
function (FIGURE 2, block 110). In this LAB
mode, the student can select different phrases (e.g.,
sentences) to listen to in isolation one or more times.
If the student wishes to concentrate on a specific
phrase, he selects the LAB mode by depressing the
ENTER key. Once in the LAB mode, the student may
select the phrase he wishes to concentrate on using
cursor control keys. If in the PREVIEW mode the student
has his text turned off (this is the default mode),
then in the LAB mode only the selected phrase will be
displayed by display 58. If, on the other hand, the student
in the PREVIEW mode selected that the text
should be displayed (by depressing F7), the full text
is displayed on display 58 but the selected phrase is
highlighted. The left arrow and the right arrow keys in
the preferred embodiment move the display "work
box" to different phrases, and the F6 key is used to
turn a phrase on and off - thereby selecting the
phrase to be treated using the LAB mode.

Once the student has selected a phrase, he can
depress the F2 key to control speech processor 60 to
play back the digitized speech corresponding to that
phrase. By depressing the F3 key, the student may
record his own pronunciation of the same phrase.
Once the F3 key is depressed, the student is prompted
to depress the space bar to begin recording (and
may adjust the recording level using the left arrow and
right arrow keys). Upon depressing the space bar,
speech processor 60 begins converting audio applied
at its audio input (e.g., from the microphone in headset
62a) into digitized speech information and storing
the digitized speech (on a virtual disk). When the student
is through speaking, he depresses the space bar
again to stop recording. The student may then depress
the F4 key to instantly play back his own just-recorded
speech - or depress the F2 key to listen
again to the model pronunciation of the selected
phrase. Depressing the ESC key returns system 50
to the PREVIEW mode (FIGURE 2, block 108), while
depressing the F5 key controls system 50 to perform
the AudioLab CLIP mode (FIGURE 2, block 112).

The CLIP mode in the preferred embodiment allows
the student to analyze any section or part of a
phrase selected in the LAB mode. A graphical illustration
of options presented to the student in the CLIP
mode is shown in FIGURE 3C. In the preferred embodiment,
the CLIP mode permits the student to examine
any section or part of the model digitized
speech recording down to a single phoneme (0.1 seconds
in duration). In addition, the complete phrase
can be heard by depressing a control key (e.g., F2).

In the CLIP mode, the cursor control keys (up arrow,
down arrow, left arrow, right arrow) are used to
select which part of the current phrase is to be played
back. A graphical illustration of the length of the currently
selected portion of the phrase is displayed at
the bottom of display 58 in the preferred embodiment.
In the preferred embodiment, this graphical illustration
includes a horizontal line having a length proportional
to the length of the selected phrase portion. The
length and position of this horizontal line change in
response to cursor controls to change the length and
position of the selected portion relative to the current
phrase.

Once the student has selected the portion of the 
phrase he wishes to concentrate on, he depresses 
the F5 key to listen to the selected portion. As in the 
LAB mode, the student may record and play back his 
own voice using the F3 and F4 keys, and alternate 
playback of his voice with playback of the selected 
clip by toggling the F5 and F4 keys. Depressing the 
F8 key resets the clip to allow the student to select a 
different part of the phrase. Depressing the ESC key 
returns system 50 to the LAB mode (FIGURE 2, block 
110). 

The AudioWrite function (FIGURE 2, block 114)
provides the student with an exercise in listening and
writing by requiring the student to type phrases he
hears. The student may listen to each phrase as many
times as he wishes, and may also listen to the entire
text before concentrating on each phrase (since in the
preferred embodiment the typing exercise operates
on a phrase-by-phrase basis). Once the student has
typed a particular phrase, he can depress the ENTER
key to control system 50 to compare the text typed in
by the student with model text and indicate any errors
in the student-generated text.

FIGURE 4 is a graphical illustration of the options
available to the student in the AudioWrite function
(FIGURE 2, block 114). Upon initiating the AudioWrite
function, the student may depress the F2 key to listen
to the entire text, or simply depress the ENTER key
to start the exercise without listening to the whole text.
Once the exercise has begun, system 50 controls
speech processor 60 to produce the audio corresponding
to the first phrase of an exercise without displaying
the corresponding text on display 58. The student
may depress the F3 key to control speech processor
60 to replay the spoken phrase (the phrase may
be replayed as many times as the student wishes).
The student then enters text by depressing the keys
on keyboard 54 (in the preferred embodiment, system
50 adds spaces, punctuation and capitalization so the
student may concentrate on spelling and grammar).
The student may depress the ENTER key at any time
to check his progress. Upon depressing the ENTER
key, system 50 compares the text keyed in by the student
with the text version of the phrase being spoken
by speech processor 60 -- and highlights any portions
of the student-typed text which do not correspond to
the model text. The student may use his cursor control
keys to move the cursor to the erroneous characters
and correct his mistakes by retyping over the incorrect
characters already there. The student may
depress the F9 key at any time to control system 50
to display the correct word corresponding to the student's
inputted word the cursor points to. The displayed
model word disappears when any other key is
depressed. When the student has correctly entered a
phrase, he depresses the enter key to hear the next
phrase. When the student finishes the exercise, he
may depress the F2 key to listen to the entire text, or
depress the ESC key to return to the main menu
(FIGURE 2, block 106).

The SoundSort function provided by the preferred
embodiment of the present invention presents
the student with an audio puzzle to solve. The "pieces
of the puzzle" are the phrases in the model text - but
jumbled in a randomized order. The student must put
the phrases back in the correct order based only on
aural clues. In the preferred embodiment, each
phrase is identified by a symbol (e.g., the letters A,
B, C) displayed on display 58. The student controls
system 50 to move the letters from a "jumbled" ordered
list displayed in the left-hand column of display
58 to a correctly ordered list in the right-hand display
column based on context of the associated phrases.
Thus, system 50's SoundSort function associates a
randomly ordered sequence of phrases with displayed
symbols, and then requires the student to reorder
the displayed symbols to correspond to the correct
order of the phrase sequence. The student may
listen to the phrases as many times as he wishes, but
system 50 does not display the text associated with
the phrases - only the symbols associated with the
phrases.

FIGURE 5 is a graphical illustration of the SoundSort
function depicted at FIGURE 2, block 116. Upon
initiating the SoundSort function, display 58 displays
a vertical column of symbols (e.g., A, B, C, D) - each
one symbolizing a phrase which is a portion of a sentence
or passage. The student may depress the F2
key to listen to the entire text in the correct order - or
he may use his up arrow and down arrow cursor control
keys to highlight one of the displayed symbols.
One symbol is always highlighted in the preferred embodiment.
Depressing the F3 key causes digitized
speech processor 60 to reproduce the speech corresponding
to the highlighted phrase. The student may
then use his right arrow cursor control key to move the
symbol corresponding to the phrase he has just heard
to the center column - and may use the up arrow and
down arrow cursor control keys to move the symbol
up and down the center column - and then use the
right arrow key to "park" the symbol in a desired position
in the right-hand column. The object of the exercise
is to move all of the symbols from the left-hand
column to the right-hand column - and to rearrange
the order of the symbols so that their rearranged order
corresponds to the correct order of the phrases. The
space bar may be depressed to change columns
(e.g., from the left-hand column to the right-hand column
or vice versa). By depressing the ENTER key,
the student is provided with an indication of his progress
-- since system 50 will highlight those symbols
moved to the right-hand column that are in the correct
order. The student may depress the F2 key at any
time to listen to the entire text - and use the left arrow
and right arrow cursor control keys to select the starting
point of the text to be reproduced in audio form.
This feature is especially useful when a long passage
or string of sentences is being operated upon (since
the student may, for example, wish to concentrate
only on the last half of the passage and may not therefore
wish to listen to the entire passage from beginning
to end). To exit the SoundSort function, the student
may depress the ESC key to return to the main
menu (FIGURE 2, block 106).
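The progress check triggered by the ENTER key can be illustrated with a short sketch. The specification does not state exactly how "in the correct order" is evaluated, so the position-by-position comparison below is an assumption, and the function name `correct_positions` is hypothetical.

```python
def correct_positions(placed, correct_order):
    """Return the indices of right-hand-column symbols to highlight.

    Assumption: a symbol counts as "in the correct order" when it sits at
    the same position in the right-hand column as in the solution sequence.
    """
    return [
        i
        for i, symbol in enumerate(placed)
        if i < len(correct_order) and symbol == correct_order[i]
    ]


# The student has parked three of four symbols so far.
solution = ["C", "A", "D", "B"]
right_column = ["C", "B", "D"]
highlighted = correct_positions(right_column, solution)
```

In this example the first and third parked symbols would be highlighted, since only those match the solution at the same position.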

Now that the overall student interface provided 
by system 50 has been described, a detailed descrip- 
tion of exemplary program control steps performed 
by personal computer 52 under software control in 
the preferred embodiment to provide that student in- 
terface will be presented in connection with FIG- 
URES 6-9C. 

FIGURES 6A-6B are together a schematic flow
diagram of an exemplary program control main routine
performed by the preferred embodiment system
50. Upon starting system 50, as described previously,
a title is first displayed (block 150), and then the system
prompts the student for an audio disk (block 152)
and waits for the student to depress a key (block 154).
System 50 then determines whether the floppy diskette
in the floppy diskette drive contains correctly
formatted lesson data (decision block 156). If the
diskette being tested is not appropriately formatted,
a warning message is displayed on display 58 (block
158), system 50 waits for the student to depress another
key (block 160), and then rechecks the diskette
contents (block 156).

Once an appropriate diskette is inserted into the
floppy diskette drive, system 50 reads a title of the
lesson (and a page two flag) stored on the diskette
and displays that title on display 58 (block 162). If the
student wishes to choose another lesson, he depresses
an appropriate key (decoded by blocks 164,
166) which causes system 50 to repeat blocks 152-166.
If the student is satisfied with the lesson on the
current diskette, he depresses, for example, the N
key (decoded by blocks 164, 166), which controls system
50 to read the contents of the current diskette
(block 168). System 50 then displays a main menu
display format (block 170) and waits for the student
to select one of the five options described previously
(decode block 172). The student may select execution
of the AudioLab routine (block 174, shown in
greater detail in FIGURES 7A-7D), the AudioWrite
routine 176 (shown in greater detail in FIGURES 8A-8B),
or the SoundSort routine (shown in greater detail
in FIGURES 9A-9C) using cursor controls and the
ENTER key as described previously.

Referring now to FIGURE 7A, the AudioLab routine
in the preferred embodiment first queries an internal
flag to determine whether this is the first time
the student has used the AudioLab function in this session
-- and if so, displays an introductory screen
(blocks 180-186). System 50 then awaits depression
of a key (block 188), and decodes that key (block 190)
to determine what operation to perform next. Depression
of the ESC key causes a return to the FIGURE
6A-6B main routine (block 192). If the student depresses
the F1 key, the currently displayed screen is
saved, and a help screen is displayed in its place
(block 194). Upon depression of a further key (block
196), the saved work screen is returned to the display
(block 198).

If the student depresses the F9 key, system 50
determines whether there is a "page two" associated
with the currently displayed text (e.g., by checking a
page two flag that is set when a page two exists -- decision
block 200). In the preferred embodiment, a
teacher generates a lesson or exercise by recording
a spoken passage; inputting a main screen of information
corresponding to that passage (this main
screen is typically the textual version of the spoken
passage, but may be any text the teacher wishes);
and may also key in a "page two" screen providing
supplementary text associated with the spoken
phrase. Thus, some lessons may have a page two
screen associated with them and others may not. In
the preferred embodiment, page two text format is
stored in files separately from the page one display
formats - and the existence of the page two file is
flagged in a file called n f ile.daf. If a page two display
format does exist, it is displayed in a manner similar
to the way the help screen is displayed (blocks 202,
204, 206) in response to depression of F9.

The student may affect the volume of audio pro- 
duced by speech processor 60 in the preferred em- 
bodiment by depressing the right arrow key (to in- 
crease the volume level) or the left arrow key (to de- 
crease volume level). Volume is controlled by writing 
a new volume level byte to speech processor 60 
(blocks 208, 212) in a conventional fashion. The preferred
embodiment also displays the current volume
level on the lower right-hand portion of display 58 in 
the form of a horizontal bar the length of which indi- 
cates volume level (blocks 210, 214). 

The student may select whether or not he wishes 
display 58 to display the page one text corresponding 
to the spoken passage by depressing the F7 key.
"Text off" is the default condition. If the student depresses
the F7 key when the text is already displayed
(decision block 216), the text off flag is set (block 218)
to suppress the display of text. If the student depresses
the F7 key when system 50 is not displaying text
(decision block 216), a text on flag is set (block 220) 
to result in display of the complete page one text as- 
sociated with the current lesson. 

When the student depresses the F2 key, he causes
the entire text of the lesson to be displayed on display
58 (but only if the text on flag is set by block 220;
blocks 222, 224). System 50 then controls speech
processor 60 to reproduce the audio corresponding to
the lesson by reading digitized speech information
from mass storage device 56 and converting it to audio
signals. In the preferred embodiment, digitized
speech is stored on mass storage device 56 in the
form of separate and discrete phrases. Files are
packed 4-bit ADPCM sound data in the preferred embodiment.
Speech processor 60 in the preferred embodiment
accepts a string of several file names to be
played in sequence. Each separate recorded phrase
file is loaded and played with an interval of 0.25 seconds
between to give the impression of continuous
replay. Up to 186 seconds of audio can be played
from a single floppy diskette, and up to twenty-three
hours may be played from a CD ROM storage device.
In the preferred embodiment the actual mechanism
for presenting digitized speech to speech processor
60 includes reading digital information from mass
storage device 56. Speech processor 60 then reads
the digitized information into its own 32K buffer and
converts the information to audio form. When speech
processor 60 reaches the end of the data stored in its
buffer, it automatically generates an interrupt request
which is serviced by a conventional interrupt handler
performed by the processor of personal computer 52.
This conventional interrupt handler (which is provided
with conventional speech processor 60) reads the
next portion of the digitized speech file from mass
storage device 56 and transfers the data to speech
processor 60. Since the transfer of information is performed
under interrupt control for only small blocks of
data at a time, the process is virtually transparent to
the user and results in only a negligible slowing of the
response time of personal computer 52.
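The buffer refill scheme described above can be modeled in a few lines. This is only an illustrative sketch: a generator loop stands in for the hardware interrupt handler, and the names `refill_blocks` and `play_phrase_files` are hypothetical rather than taken from the specification. Only the 32K buffer size comes from the text.

```python
import io

BUFFER_SIZE = 32 * 1024  # the 32K speech processor buffer mentioned in the text


def refill_blocks(stream, buffer_size=BUFFER_SIZE):
    """Yield successive buffer-sized blocks of a digitized speech file.

    Each yielded block models one "buffer empty" interrupt being serviced:
    the handler reads the next portion of the file and hands it over.
    """
    while True:
        block = stream.read(buffer_size)
        if not block:
            return
        yield block


def play_phrase_files(phrase_files, sink):
    """Feed a sequence of phrase files to `sink`, one block at a time."""
    for data in phrase_files:
        for block in refill_blocks(io.BytesIO(data)):
            sink(block)


# A 70,000-byte phrase file is delivered as two full blocks and a remainder.
block_sizes = [len(b) for b in refill_blocks(io.BytesIO(bytes(70000)))]
```

Because each transfer is bounded by the buffer size, the time spent servicing any single refill stays small, which matches the "virtually transparent" behavior described.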

Once system 50 begins reproducing the audio
corresponding to a particular lesson passage (block
226), it continues to produce the entire audio passage
until it reaches the end of the passage or until the student
depresses the ESC key (decision block 228).
Upon the occurrence of either of these two events,
display 58 is cleared and a command line is displayed
to permit the user to select another option (block 230).

The user may at any time depress the enter key
to enter the LAB mode (blocks 194-230 being part of
the PREVIEW mode discussed previously). Upon entering
the LAB mode, system 50 first determines
whether text display is on or off (e.g., by testing the
value of the text on flag) (decision block 232). If text
display is off, system 50 displays the "current phrase"
on display 58 (block 234) - that is, the phrase that
was being "played back" while in the PREVIEW
mode. PREVIEW mode plays all the phrases. When
the student first enters Text Lab mode, the first
phrase is the current phrase. System 50 then waits for
the student to depress a control key to select one of
the LAB options (block 236, decode block 238).

The LAB mode provides its own help screen support
upon depression of the F1 key (blocks 240-244),
and permits the user to exit back to the PREVIEW
mode upon depressing the ESC key (block 246, with control
being returned back to FIGURE 7A block 188). Similarly,
depressing the F9 key causes a "page two" display
screen format to be displayed on display 58 if
such a "page two" format exists (blocks 248-254).

If the user depresses the F2 key, system 50 reproduces
the audio corresponding to the current
phrase (block 256). Depressing the down arrow or
right arrow cursor control key causes system 50 to select
the "next phrase" (that is, the next file in a sequence
of files that store the digitized speech phrases corresponding
to the current lesson) while depressing the up
arrow or left arrow cursor control key causes selection
of the previous phrase (blocks 258, 264; blocks
260, 262, 266, 268). The F6 key causes a phrase selected
by the cursor control keys to be turned on (i.e.,
flagged) and turned off (i.e., unflagged) (blocks 270-274).
A phrase that is turned on is displayed, while
turning off a phrase causes that phrase to cease being
displayed (block 276). The student may at any
time turn display of the complete main text on and off
by depressing the F7 key (blocks 278-284).

If the phrase flag is on, each phrase will be displayed
as it is needed, even if text is off. If the phrase
flag is off, each phrase will be erased as it is reached,
even if text is on. The student may also at any time
depress the F3 key in the preferred embodiment to record
his own voice. When the F3 key is depressed,
system 50 first gives the student the option to increase
or decrease record level gain using the left arrow
and right arrow cursor control keys (block 286,
decode block 288, blocks 290, 292). If the student alters
the record gain, the new gain is displayed in the
lower right-hand corner of display 58 (blocks 294,
296) and the new record gain level is written to
speech processor 60 in a conventional manner. The
student depresses the space bar to begin the recording
process (block 298). When recording is begun,
speech processor 60 is controlled to begin converting
signals at its audio input into digitized speech signals
and writing those digitized speech signals onto the
virtual disk. This process continues until either the
user depresses the space bar again to terminate recording
or until a preset recording time (the length in
time of the model phrase in the preferred embodiment)
has elapsed (block 302). A record flag is then
set (block 304) to indicate that a phrase has been recorded,
and the command line is displayed once
again (block 306). If the student now depresses the
F4 key to play back his recorded phrase, it is first determined
whether the record flag has been set (decision
block 308) - and if it has been set, system 50
controls speech processor 60 to convert the student's
stored digitized speech into audio (block 310).

The LAB mode thus permits the student to concentrate
on a specific phrase from the prerecorded
spoken lesson. If the student has trouble with a particular
phrase, however, he may wish to listen to small
pieces of that phrase in isolation (e.g., one syllable at
a time) so he can learn how to speak the entire
phrase. The preferred embodiment of the present invention
allows the student to concentrate on any portion
of the current phrase by depressing the F5 key to
enter the CLIP mode. Upon entering the CLIP mode,
system 50 displays a "clip" line (a horizontal line at the
bottom of the display indicating the length and position
of the current "clip" relative to the current phrase
display) and a new command line (block 312) and
then waits for the student to depress a key. Depression
of the ESC key deletes the CLIP line and returns
to FIGURE 7B block 236. A help screen is provided for
the CLIP mode (blocks 318-322), and the CLIP mode
also permits the user to play the current phrase from
beginning to end by depressing the F2 key (block
324). Similarly, the student may record and play back
his own speech just as in the LAB mode (blocks 326-348)
by depressing the F3 and F4 keys, respectively.

Briefly, the clip mode provides two indexes into
the digitized speech file relating to the currently selected
phrase: r (the "right-hand pointer" - which
points to the end of the "clip"); and l (the "left-hand
pointer" - which points to the beginning of the clip).
The right-hand pointer r is incremented and decremented
by the right-arrow and down-arrow cursor
control keys, respectively, between the values of L
(the beginning of the current phrase) and R (the end
of the current phrase). Right-hand pointer r points to
the end of the portion of the digitized speech phrase
that is selected (blocks 350-354, 368-372). The left-hand
pointer l is decremented and incremented by
the left-arrow and up-arrow cursor control keys, respectively,
between the values of L and R (which thus
set a range for both l and r - l and r cannot pass each
other). The left-hand pointer l points to the beginning
of the "clip" (blocks 356-366).

In the preferred embodiment, the left-hand pointer
l and the right-hand pointer r define absolute time
offsets into the file containing digitized data representing
the current phrase. Thus, depressing the up-arrow
key moves the left-hand pointer l to the right (toward
the end of the phrase); depressing the left-arrow
key moves the left-hand pointer l to the left (toward
the beginning of the phrase); depressing the down-arrow
key moves the right-hand pointer r to the left (toward
the beginning of the phrase); and depressing the
right-arrow key moves the right-hand pointer r to the
right (toward the end of the phrase).
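The pointer handling just described can be summarized in a small sketch. The class name `Clip`, the key names, and the integer time unit (tenths of a second) are illustrative assumptions. So is the rule that the clip always keeps at least one step of length; the text only says that the two pointers cannot pass each other.

```python
from dataclasses import dataclass

STEP = 1  # one step = 0.1 s, the clip resolution stated in the text


@dataclass
class Clip:
    L: int  # beginning of the current phrase, in tenths of a second
    R: int  # end of the current phrase, in tenths of a second
    l: int  # left-hand pointer (beginning of the clip)
    r: int  # right-hand pointer (end of the clip)

    @classmethod
    def for_phrase(cls, length_tenths):
        """Start with the clip covering the whole phrase."""
        return cls(L=0, R=length_tenths, l=0, r=length_tenths)

    def press(self, key):
        """Apply one cursor key press, keeping L <= l < r <= R."""
        if key == "up":       # left pointer moves toward the end
            self.l = min(self.l + STEP, self.r - STEP)
        elif key == "left":   # left pointer moves toward the beginning
            self.l = max(self.l - STEP, self.L)
        elif key == "down":   # right pointer moves toward the beginning
            self.r = max(self.r - STEP, self.l + STEP)
        elif key == "right":  # right pointer moves toward the end
            self.r = min(self.r + STEP, self.R)


clip = Clip.for_phrase(30)  # a 3.0 second phrase
for _ in range(3):
    clip.press("up")
for _ in range(40):
    clip.press("down")      # stops one step after the left pointer
```

Holding integer tenths rather than floating-point seconds keeps repeated key presses exact.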

In the preferred embodiment, the "clip" mode
works on the basis of time. That is, system 50 controls
the speech processor 60 (and associated disk read interrupt
routine) to seek directly to the point in the
phrase file pointed to by the left-hand pointer l and
to begin playing back the file from that point until the
point pointed to by the right-hand pointer r (at which
point the play back ceases) (block 378). The effect is
that the user can select and "play back" any arbitrarily
small portion of the current phrase (within the range
of resolution of variables l and r - 0.1 seconds in the
preferred embodiment) without having to hear the remaining
part of the phrase (and also without having
to wait for the delays during which the remaining portions
of the phrase would be played back). In the preferred
embodiment, the CLIP mode is more than
merely a "mute" function since it actually presents
only the desired digitized speech data to speech processor
60 for conversion to audio signals.
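As a rough illustration of seeking by time, the following sketch converts the two time offsets into a byte range within a packed 4-bit ADPCM file (two samples per byte). The 8000 Hz sample rate, the absence of a file header, and the function names are assumptions made for the example; the specification gives none of them.

```python
def clip_byte_range(l_tenths, r_tenths, sample_rate_hz=8000):
    """Map clip pointers (in tenths of a second) to a byte range.

    Assumes headerless packed 4-bit samples, i.e. two samples per byte,
    and a hypothetical sample rate of 8000 Hz.
    """
    samples_per_tenth = sample_rate_hz // 10
    start = l_tenths * samples_per_tenth // 2
    end = r_tenths * samples_per_tenth // 2
    return start, end


def read_clip(data, l_tenths, r_tenths, sample_rate_hz=8000):
    """Return only the bytes of `data` that fall inside the clip."""
    start, end = clip_byte_range(l_tenths, r_tenths, sample_rate_hz)
    return data[start:end]


# A 0.1 second clip starting 0.3 seconds into a 3.0 second phrase.
phrase = bytes(12000)  # 3.0 s at 8000 Hz, 4 bits per sample
segment = read_clip(phrase, 3, 4)
```

Only the selected bytes are handed on, which mirrors the point that the clip is not a mute of the full phrase. A real ADPCM decoder may also need valid predictor state at the start offset, which this sketch does not model.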

FIGURES 8A-8B are together a flow chart of exemplary
program control steps performed by system
50 to implement the AudioWrite function shown in
FIGURE 6. When the student selects the AudioWrite
function (see FIGURE 6, blocks 172, 176), instructions
in a command line are displayed (block 380) and
then system 50 waits for the student to select one of the
options presented to him by the AudioWrite function.
Depressing the ESC key causes control to return to
the main routine (FIGURE 6, block 170). The student
may depress the F2 key to play back the audio corresponding
to the current lesson (blocks 386, 388). Depressing
the F1 key or the enter key causes system
50 to display a command line (block 390) and then
play back the first phrase from the current lesson
(block 392). System 50 then waits for the student to input
either the words matching the phrase he just heard
or a control key (block 394). If a word from the model text
is displayed upon depressing this key, this word is removed
from the display (blocks 396-400) - and likewise,
display of the elapsed time is suppressed if the
elapsed time is displayed when the next key is depressed
(blocks 402 and 406).

Decode block 408 then determines which key
the user has depressed. If the user depresses the F3
key, the time that has elapsed since the AudioWrite
exercise began is displayed in a conventional manner
(block 410). The cursor control keys cause the cursor
to move to the right or the left on display 58 (blocks
412-418).

The "object" of the AudioWrite exercise is for the 
student to input alphanumeric characters which 
match the phrase he is hearing from speech proces- 
sor 60 (and thus also the textual version of the same 
phrase from the main text). If the student inputs an al- 
phanumeric key, the character corresponding to the 
key he inputs is displayed on the display and the cur- 
sor is moved one character to the right (blocks 420, 
422). Block 420 also causes the character corresponding
to the key depressed by the user to be added
to a text string buffer for analysis when the user depresses
the enter key. System 50 automatically "fills
in" spaces and punctuation and changes the case of
the displayed characters if necessary to match the
"model" text. If the user depresses the delete key, the
character displayed immediately to the left of the cur- 
sor is deleted from the display (and also from the text 
string buffer) (blocks 424, 426). If the student de- 
presses the F9 key, a word from the model text cor- 
responding to the exercise is displayed in the lower 
right-hand corner of display 58 in the preferred em- 
bodiment (blocks 428, 430). Depressing the enter key 
causes system 50 to check the user inputted contents 
of the text string buffer against the model text string 
(character by character) and indicate errors in the 
user inputted string -- as will now be explained. 

Upon depressing the enter key, system 50 first displays
the elapsed time in the lower right-hand corner of
display 58 (block 432). System 50 then scans through the
student inputted text string buffer one character at a
time beginning with the first character in the buffer
(block 434). System 50 compares, for example, the first
character in the student inputted buffer with the first
character of a model text string stored on mass storage
device 56 corresponding to the current phrase. If these
two characters correspond, no action is taken (decision
block 436). On the other hand, if the characters do not
correspond, the display of the first character is
highlighted on display 58 (block 438). This process
(blocks 434-438) continues until all letters in the
student inputted text string buffer have been compared
with the model text string characters (spaces and
punctuation being ignored). If any letters are incorrect
(decision block 440), system 50 moves the cursor to the
beginning of the first word that has a wrong character to
permit the student to correct his error (block 442). If
all characters of the student inputted text string
correspond exactly to the characters in the model text
string (meaning that the student-inputted string is both
entirely correct and complete), system 50 waits for enter
to be depressed, then advances to the next phrase (block
444) and repeats blocks 392-442 for that next phrase. If
the entire lesson has been analyzed (as tested for by
decision block 446), an end of lesson message is
displayed (block 448) and upon inputting another key
(wait block 450) control returns to FIGURE 6 block 170.
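
The character-by-character check described above can be
sketched as follows. This is only an illustrative sketch:
the function names, the case-sensitive comparison and the
returned tuple are assumptions, not details taken from the
flow charts.

```python
def significant(text):
    """Return (index, character) pairs for the characters that are compared.

    Spaces and punctuation are skipped, as in the description above.
    """
    return [(i, ch) for i, ch in enumerate(text) if ch.isalnum()]


def check_input(student, model):
    """Compare the student's string with the model, character by character.

    Returns (wrong, cursor, complete):
      wrong    - indices in `student` of characters to highlight
      cursor   - index of the start of the first word containing an error,
                 or None if nothing is wrong
      complete - True only if every model character was typed correctly
    """
    typed = significant(student)
    expected = [ch for _, ch in significant(model)]

    # A typed character is wrong if it differs from the model character in
    # the same position, or if the model has no character in that position.
    wrong = [
        i for k, (i, ch) in enumerate(typed)
        if k >= len(expected) or ch != expected[k]
    ]
    complete = not wrong and len(typed) == len(expected)

    cursor = None
    if wrong:
        cursor = wrong[0]
        # Back up to the beginning of the word holding the first error.
        while cursor > 0 and not student[cursor - 1].isspace():
            cursor -= 1
    return wrong, cursor, complete
```

For instance, check_input("Cats hive four", "Cats have four legs.")
flags the "i" at index 6 and places the cursor at index 5,
the start of the word "hive".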

FIGURES 9A-9C are together a detailed flow chart of
exemplary program control steps performed by system 50 to
implement the SoundSort function of the preferred
embodiment of the present invention. As will be recalled
from the discussion above, the SoundSort function
presents the student with a game in which he is expected
to move symbols on display 58 corresponding to phrases of
a sentence or passage into the correct order (after
system 50 has reordered the phrases into a random order).

Upon initiating the SoundSort routine (from decode block
172, FIGURE 6), it is first determined (e.g., by checking
a flag) if this is the first time the student has used
SoundSort in this session (decision block 452). If the
current execution is the first time of use, an
introductory screen explaining how to play the SoundSort
game is displayed by display 58 (blocks 454, 456). System
50 then accesses a sentence or passage of the lesson
stored on mass storage device 56, this sentence including
plural phrases. In the preferred embodiment, only
passages with a relatively small number of (e.g., a
maximum of 21) phrases are especially suitable for
SoundSort since the SoundSort function uses the first 21
phrases of a given lesson (additional phrases are
ignored). SoundSort then randomizes the sequence of
phrases within the lesson (e.g., using a conventional
pseudo-randomizing algorithm) to provide a randomized
("jumbled") sequence of the original phrase sequence.

System 50 then assigns a symbol (alphabetical letters in
the preferred embodiment) to each one of the random-order
phrases (block 458). For example, suppose the four phrase
sequence involved is: "Cats" "have" "four" "legs." With
each word being a discrete phrase, block 458 might
randomize the order of the phrases to result in: "Four"
"legs" "cats" "have", and then assign the symbol "A" to
symbolize the phrase "four"; the symbol "B" to symbolize
the phrase "legs"; the symbol "C" to symbolize the phrase
"cats"; and the symbol "D" to symbolize the phrase
"have". System 50 then displays on display screen 58 the
symbols corresponding to the reordered phrase sequence in
the left-hand column of the display (see FIGURE 5) so
that the phrases, if heard in the order of the symbols
displayed on the display, would be in the randomized
order (block 460). System 50 then waits for the student
to depress a key to select the next function to be
performed (blocks 462, 464).
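
The jumbling and symbol assignment just described might
look roughly like the sketch below. The names jumble and
MAX_PHRASES, the use of Python's random module and the two
returned structures are assumptions made for illustration
only.

```python
import random
import string

MAX_PHRASES = 21  # only the first 21 phrases of a lesson are used


def jumble(phrases, rng=None):
    """Shuffle the phrases and label the shuffled sequence A, B, C, ...

    Returns (symbols, correct):
      symbols - dict mapping each letter to the phrase it stands for
      correct - list of letters in the original (correct) phrase order
    """
    rng = rng or random.Random()
    used = list(phrases[:MAX_PHRASES])   # additional phrases are ignored
    order = list(range(len(used)))
    rng.shuffle(order)                   # pseudo-random reordering

    symbols = {}
    correct = [None] * len(used)
    for slot, original_index in enumerate(order):
        letter = string.ascii_uppercase[slot]
        symbols[letter] = used[original_index]
        # Remember which letter belongs at this position of the passage.
        correct[original_index] = letter
    return symbols, correct
```

Reading the phrases back through the correct list always
reproduces the original passage, whatever order the
shuffle happened to produce.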

The SoundSort function 178 in the preferred embodiment
provides a help screen giving the student instructions
for what to do next if he gets confused (blocks 472-476).
The student may exit the SoundSort function 178 at any
time by depressing the ESC key (decode block 464). If the
student confirms he wishes to leave the SoundSort
function, a return to main routine block 170, FIGURE 6A
is performed (blocks 468, 470). On the other hand, if the
student does not confirm he wishes to leave the SoundSort
function, he is returned back to the get key block 462 to
select the next function (decision block 468).

If the student depresses the F3 key in the preferred
embodiment, system 50 plays back the phrase associated
with the symbol the cursor is presently pointing to
(block 478). Upon depressing the F2 key, system 50
displays on display 58 a prompt which prompts the user
for "starting point?" (block 480), and then waits for the
user to input another selection (blocks 482, 484). By
striking the F2 key, the student may play back the entire
sequence of phrases in their correct order, or can select
a portion of the correctly ordered sequence of phrases
and hear the corresponding audio. After depressing the F2
key, if the student again depresses F2 (or depresses the
ENTER key), system 50 plays back a phrase sequence
beginning at a portion of the sequence pointed to by a
pointer called a "start point" which is initially set at
the beginning of the correctly ordered phrase sequence
(but may be changed by the student) (block 494). Once the
phrase sequence playback has begun, it will continue to
the end of the sequence of phrases or until the student
again hits the F2 key (decision 496). If, instead of
striking the F2 key or the ENTER key, the student
depresses the left arrow or right arrow keys in the
preferred embodiment, the effect will be to change the
value of start point. In particular, if the student
depresses the right arrow key, the start point pointer
value is advanced in the phrase sequence and its new
value is displayed (blocks 486, 488). Similarly, by
depressing the left arrow key, the start point pointer
value is retracted toward the beginning of the phrase
sequence (blocks 490, 492). This allows the student to
concentrate on the last portion of the correctly ordered
phrase sequence, for example, (or on any portion of the
phrase sequence since he can strike the F2 key to
discontinue phrase sequence playback) and is especially
useful for long phrase sequences since it permits the
student to listen to three or four phrases in the
sequence, for example, rather than the entire sequence
(which may be of arbitrary length).
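
As a rough illustration of the start point mechanism, the
sketch below moves a start index with the arrow keys and
plays from that index onward. The function names, the
clamping at either end of the sequence and the stop_after
parameter (standing in for the student striking F2 during
playback) are assumptions, not details from the flow
charts.

```python
def move_start(start, delta, n_phrases):
    """Advance (+1) or retract (-1) the start point, clamped to the sequence."""
    return max(0, min(n_phrases - 1, start + delta))


def playback(phrases, start, stop_after=None):
    """Return the phrases that would be played from the start point onward.

    Playback runs to the end of the sequence unless `stop_after` phrases
    have already been played (a stand-in for pressing F2 to interrupt).
    """
    played = []
    for count, phrase in enumerate(phrases[start:]):
        if stop_after is not None and count >= stop_after:
            break
        played.append(phrase)
    return played
```

Moving the start point forward twice and then playing a
four-phrase passage would reproduce only its last two
phrases.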

In the preferred embodiment, the left arrow and 
right arrow cursor control keys only have the effect of 
changing the phrase sequence playback beginning 
point after the F2 key has been depressed. Other- 
wise, they control movement of the displayed symbols 
on display screen 58. Depressing the right arrow key 
causes system 50 to first determine whether the cursor
is in the left column or the center column (decision 
blocks 498, 502, respectively). The objective in the 
SoundSort function is to move the symbols displayed in 
the left column further to a center column - and then to 
move those symbols into a right-hand column in the cor- 
rect order based upon aural clues. If the cursor is in the 
left column (and thus is pointing to a symbol displayed 
in the left column), and the user depresses the right 
arrow key, the symbol pointed to by the cursor is re- 
moved from the left column and displayed in the cen- 
ter column (block 500), using conventional screen 
control techniques. Similarly, if the cursor is pointing 
to a symbol in the middle column and the user de- 
presses the right arrow key, the symbol is moved to 
a right column position so long as there isn't already 
a symbol displayed immediately to the right in the 
right column (decision block 502, 506). Striking the 
left arrow key permits the student to move a symbol 
in the right column back to the middle column or from 
the middle column back to the left column (blocks 
510-520). 

In the preferred embodiment, the student 
changes the order of symbols by moving them to the 
center column and then moving them vertically before 
placing them into "slots" in the right column (these 
slots correspond to entries in an array maintained in 
memory). Upon depressing the up arrow key, for ex- 
ample, system 50 first determines whether the cursor 
is pointing to a symbol in the center column (decision 
block 522). If so, the pointed to symbol is moved up 
one row in the center column (unless the symbol is al-
ready in the top row, in which case it is wrapped
around to the bottom) (blocks 524-528). If the cursor 
points to a symbol in the left-hand or right-hand col- 
umn, on the other hand, the cursor is moved up one 
row (block 530) and then system 50 determines 
whether the new cursor position is on the letter in the 
left or right column (decision block 532). If the cursor 
does not point to a letter in its new position, it is either 
moved up or wrapped around (decision block 534, 
536). Similar symbol movement occurs upon de- 
pressing the down arrow key (blocks 540-560). 
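
The wrap-around movement in the center column can be
sketched as below. The list-of-rows representation and
the choice to swap with whatever occupies the destination
row are assumptions; the description does not say how an
occupied destination is handled.

```python
def move_up(column, row):
    """Move the symbol at `row` up one row, wrapping from top to bottom.

    `column` is a list with one entry per display row (None = empty).
    Returns the symbol's new row index.
    """
    target = row - 1 if row > 0 else len(column) - 1
    # Assumption: exchange places with the destination row's contents.
    column[row], column[target] = column[target], column[row]
    return target
```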

Depressing the space bar in the preferred embodiment
controls the cursor to move between left and right
columns. For example, if there are symbols displayed in
both the left column and the right column and the cursor
is presently in the center column, striking the space bar
will do nothing. Space only has an effect if the cursor
is in either the left or right column. In the preferred
embodiment, the space bar will only move the cursor to
columns where symbols are displayed. It only moves the
cursor between the left and the right columns (and is
ignored when the cursor is in the center column), and
always results in the cursor pointing to the uppermost
symbol in the new column (blocks 562, 564, 566).

Depressing the ENTER key controls system 50 to check the
right column entries to determine which ones are correct
and which ones are incorrect so that the student can
monitor his progress. Upon depressing the ENTER key,
system 50 examines the contents of the right column
positions one at a time (block 568). If the student has
moved a symbol into a certain position, system 50
compares that symbol with a symbol order string (array)
it formed at block 458 indicating the correct order of
the symbols (decision block 570). If the symbol under
examination in the right-hand column corresponds to the
symbol order (array) in the model symbol string, it is
marked on the display as being correct (blocks 572, 576).
If the symbol is incorrect, it is marked on the display
as being wrong (e.g., by highlighting) (blocks 572, 574).
After all of the right-hand column positions have been
marked correct or incorrect by blocks 570-576, system 50
determines whether any one has been marked incorrect
(decision block 578). If at least one symbol in the
right-hand column is wrong (or missing), an elapsed time
indicator is displayed along with new command lines and
system 50 then waits for the student to depress a key
(blocks 580, 582). Upon depressing a key, the work screen
is restored to permit the student to continue moving the
symbols (blocks 584, 462, 464). If, on the other hand,
the student has successfully moved all of the symbols to
the right-hand column in the correct order, an end
message is displayed (block 586) and control returns to
main routine (blocks 170, 172).
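
The right-column check might be sketched as follows. The
function name, the three mark strings and the use of None
for an empty slot are illustrative assumptions; `correct`
stands for the symbol order array formed at block 458.

```python
def check_right_column(slots, correct):
    """Mark each right-column slot and report whether the game is finished.

    `slots` holds the symbol the student placed in each position (None if
    the position is still empty); `correct` holds the expected symbols.
    Returns (marks, finished).
    """
    marks = []
    for placed, expected in zip(slots, correct):
        if placed is None:
            marks.append("missing")
        elif placed == expected:
            marks.append("correct")
        else:
            marks.append("wrong")   # would be highlighted on the display
    finished = len(slots) == len(correct) and all(m == "correct" for m in marks)
    return marks, finished
```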

STUDIO ROUTINE 


FIGURES 10-16 describe utilities provided by the
presently preferred exemplary embodiment of the present
invention to permit a teacher to form and/or customize
lessons and exercises for use by students. FIGURE 10 is a
high-level flow chart-type diagram of the user interface
presented to the teacher. Upon starting the studio
routine in the preferred embodiment, the title screen is
displayed along with options available to the teacher
(block 600). In the preferred embodiment, the teacher may
select between four different options: (1) an instruction
display; (2) a Text Writer word processor-type function;
(3) an AudioLab studio function; (4) exit. In the
preferred embodiment, selecting option number 1 displays
an instruction screen (block 604) in which the teacher is
told about a suggested general methodology for using the
AudioLab studio and Text Writer functions. Briefly, the
teacher generally first uses the Text Writer function to
type in one or more screens of text the students are to
view on the screen during the lesson - including the page
one and page two screens described previously. The page
one screen generally is (but need not be) the textual
version of the recorded audio. The page two screen may be
help or instructions. After the Text Writer function
(block 606) has been used to input one or two screens of
text, the AudioLab studio function is used to convert
spoken audio into digitized speech phrase files stored on
mass storage device 56 and to associate that recorded
audio with the text previously entered using the Text
Writer function. Specifically, the teacher first chooses
recording text (block 608) and then may choose whether or
not to include a page two help screen (block 610, 612).
The teacher then marks and records phrases using speech
processor 60 (block 614), and is finally permitted to
select student menu layout for the lesson (block 616).

In the preferred embodiment, in the TextWriter 
routine (block 606) a format different from ASCII is 
used for convenience and a utility is provided for con-
verting from ASCII to the different format. Preferably,
the text files created by the Text Writer program rou- 
tine (block 606) are of limited length so that they can 
each fit on a single display screen (80x21). 

FIGURE 11A is a detailed flowchart of exemplary
program control steps performed by the studio rou- 
tine shown in FIGURE 10. As mentioned previously, 
upon initiating this studio routine, the title screen is 
displayed (block 600) and then the keyboard input is 
decoded to allow the teacher to select one of four op- 
tions (block 602). Instructions may be displayed, the 
text writer conventional word processor may be called 
(block 606) or the studio routine may be exited. Once 
the teacher has inputted one or two text screens us- 
ing the Text Writer word processor, he may select the 
AudioLab studio routine to actually assemble the tex- 
tual and audio components of a lesson (beginning at 
block 620). Upon selecting the AudioLab studio func- 
tion, system 50 first calls a select file routine (named 
"selfile" in the preferred embodiment) to choose a 
main text format to be associated with the lesson. A 
detailed flow chart of the select file routine 620 is 
shown in FIGURE 12. 

Upon initiating the select file routine 620, system 50
first displays a command line (block 622) and then calls
a routine called "choose drive" 624 to permit the teacher
to select which of several drives he wishes to use. As is
well known, personal computer 52 may have one or more
hard disk drives and one or more floppy diskette drives
(all of which are shown schematically in FIGURE 1 as mass
storage block 56). Generally, the teacher wishes to store
lessons on floppy diskettes so that they can be easily
copied and distributed to students. The choose drive
routine 624 (shown in detail in FIGURE 13) uses
conventional MS/DOS utilities in the preferred embodiment
to count the number of disk drives (block 626), then
clears the display screen 58 (block 628) and then
displays a window setting forth the drive designations of
each of the existing mass storage drives (blocks 630,
632). The personal computer 52 keyboard buffer is then
cleared (block 634) and system 50 prompts for the
teacher's choice (block 636-642). Depressing the F1 key
displays help text (block 644-648). If the teacher
depresses the up arrow or down arrow cursor control keys,
different drive designation options displayed by display
58 are highlighted so as to permit the teacher to alter
the drive selection in a conventional manner (blocks
650-660). If the teacher keys in a valid drive
designation letter rather than using the cursor controls,
that value is selected as the designated drive (block
662). Otherwise, depressing the ENTER key causes the
drive designation selected by the cursor control keys to
be selected. Upon depressing the ENTER key, system 50
first determines whether the A or B floppy diskette
drives have been chosen (decision block 664). If not,
then a hard disk has been selected and the hard disk
designation is returned (block 666). If a floppy diskette
drive has been selected (decision block 664), the
keyboard buffer is cleared once again (block 668) and the
system prompts the teacher to insert a diskette in the
diskette drive (blocks 670, 672). Striking the F1 key at
this point displays help text (block 674-680). If the
teacher depresses the ESC key, the routine is aborted
(decision block 682, 684). If any other key is depressed,
routine 624 returns to FIGURE 12 block 686, with the A or
B drive designation selected (block 666).

Referring once again to FIGURE 12, system 50 then
determines the reason why the choose drive routine 624
was exited. If the reason was because the teacher
depressed the ESC key at block 638, control returns to
FIGURE 11A block 620 with a null return string (decision
block 686, 688). If, on the other hand, the teacher
depressed the ESC key at FIGURE 13 block 682, routine 624
is called again to permit choosing of another drive
(decision block 690).

In the preferred embodiment, all files associated 
with a particular lesson are preferably collected within 
a common subdirectory. The teacher may create the 
subdirectory before initiating the FIGURE 10 routine 
using conventional DOS utilities, or a conventional
create directory routine may be included in the select 
file routine to permit the teacher to create a subdirec- 
tory on the file. 

Once a valid subdirectory exists on mass storage device
56, system 50 permits the teacher to select between
different subdirectories that may exist if multiple
subdirectories exist. Specifically, the flag USEDIRS is
set by system 50 whenever at least one user subdirectory
exists on mass storage device 56 (block 746, 744). A flag
CHDRIVES is set to eventually require the user to choose
another drive using the choose drive routine 624 (blocks
748-752) if no valid subdirectory exists. Otherwise, the
flag CHDIRS is set to 0 (block 752) and decision block
754 determines whether it is necessary for the teacher to
select between user subdirectories (e.g., if more than
one subdirectory exists). If subdirectory selection is
required, a routine DIRMENU 756 is called to permit
subdirectory selection. A detailed flow chart of this
routine 756 is shown in FIGURE 14.

The DIRMENU routine 756 first uses convention- 
al DOS utilities to find and record all subdirectories on 
mass storage device 56 (block 758). This option is not 
available in Studio. The new lesson is always saved 
on a floppy disk (drive A). So long as additional sub- 
directories can be created, the teacher is given the 
option to create a new subdirectory for the new lesson 
(blocks 760, 762). Next, instructions and a list of all 
of the subdirectory names existing on mass storage 
device 56 are displayed (block 764), and system 50 
then awaits user input (blocks 756-774). The cursor 
control keys are used to highlight different displayed 
subdirectory names in a conventional manner (blocks 
776-782), and a conventional help facility is also pro- 
vided (blocks 784-790). Upon depressing the ENTER 
key, system 50 determines whether the teacher selected
the option to create a new subdirectory (decision block
792), and if so, may create a new subdirectory in an
entirely conventional manner using the DOS "MKDIR"
utility or the like. Otherwise, the subdirectory name is
stored (block 794) and a return to routine 620 is
performed (block 796).

The teacher may depress the ESC key at any 
time to select another drive or another diskette, (and 
thus call the choose drive routine 624) (decision block 
798). Otherwise, the teacher is permitted to select be- 
tween files within the selected subdirectory using the 
FILEMENU routine 804. A detailed schematic flow 
chart of the exemplary program control steps related 
to the FILEMENU routine 804 is shown in FIGURES 
15A-15B. 

Referring now to FIGURE 15A, routine 804 first 
scans the selected subdirectory (using conventional
DOS utilities) for all files having the extension ".tit"
(block 806). If more such files exist than will fit on
the display, a warning message is
displayed (block 808-812). If no such files exist, sys- 
tem 50 determines whether the teacher has mistak- 
enly removed the diskette from the drive (block 814, 
816) and if he has displays an error message (block 
818). If no such files exist and the diskette is still with- 
in the drive, an error message indicating that no text 
files exist is displayed (block 820). 

If decision block 814 determines that some ".tit" files
do exist, system 50 displays instructions and a list of
the ".tit" files (block 820) and then permits the teacher
to select one of the listed files (blocks 822-828). By
manipulating the cursor control keys, the teacher can
highlight any selected file name displayed on display 58
(blocks 830-836), and may select the highlighted file by
depressing the ENTER key (block 838, 840). Depressing the
ESC key exits routine 804 without selecting a file name
(block 842).

Referring now once again to FIGURE 12, if the teacher
failed to select a file name (determined by decision
block 844), the flags are set appropriately (block 846)
to permit the teacher to select another subdirectory
(decision block 848, blocks 752, 756). Otherwise, the
selected file name is returned at block 850 to FIGURE 11A
block 852.

FIGURE 11A block 852 reads the text of the selected file
and displays it on display 58 (blocks 852, 854). System
50 then prompts the teacher whether he wishes to accept
the text (blocks 856, 858). If he does not accept the
text, routine 620 is called again to permit him to select
another file. Otherwise, the teacher is prompted to enter
a new lesson title (block 860) and is asked whether he
wishes to include a second page of help or instructions
in the lesson (block 862, 864). If a page two screen is
to be included, routine 620 is called to permit selection
of the file containing the page two text and the teacher
is given the opportunity to view and accept this page two
text (blocks 866, 872). The second page need not
necessarily be related to either the main page of text or
the recorded speech - permitting great flexibility to the
teacher in creating lesson formats. However, the page two
screen typically is supplementary textual material or
instructions which may be displayed by the student upon
depressing a key.

Once both the main text screen and the page two text
screens have been selected and accepted by the teacher,
system 50 prompts the teacher to insert a data disk
(block 874, 876) which should preferably be blank in the
preferred embodiment to provide sufficient room (e.g.,
minimum 360K) for storing digitized speech corresponding
to the lesson being created (blocks 878-884). System 50
in the preferred embodiment thus insists that a blank
formatted diskette is used for each lesson to ensure that
recordings are transferred whole onto the diskette and
thus decrease access time (by eliminating searches for
different related speech files).

System 50 then displays once again the main text screen
selected by routine 620 and accepted at blocks 856, 858
(block 886) and waits for the teacher to select a string
of text on the display (blocks 888, 890). Briefly, in the
preferred embodiment, the teacher first selects a phrase
using the cursor control keys and then records digitized
speech corresponding to the phrase using the F4 key. The
teacher may re-record a given phrase if necessary.
Depressing the ENTER key causes system 50 to move on to
the next phrase. The "<" and ">" keys may be used to skip
over displayed words the teacher does not wish to record.
Each recorded phrase may be up to ten seconds long.

Blocks 892, 894 are used to skip over displayed words (so
that not all words of the main text screen need to
correspond to a recorded phrase). The left arrow and
right arrow keys are used to lengthen and shorten the
currently selected phrase, with the selected phrase being
highlighted on the display to permit the teacher to see
what phrase he has selected (blocks 896-902). Depressing
the F3 key causes system 50 to determine whether a
digitized speech file corresponding to the currently
selected phrase has already been recorded (decision block
904). If one has been recorded, that recording can be
played (block 906) - allowing the teacher to hear what he
has recorded corresponding to the phrase. Depressing the
F4 key allows the teacher to record (or re-record) up to
ten seconds of digitized speech corresponding to the
selected phrase (blocks 920-926). Once a phrase has been
recorded, it is stored on the blank formatted diskette
and the teacher is given an indication of the amount of
free disk space remaining in seconds (blocks 928, 930).
During recording, volume levels are continuously
displayed and recording time is also displayed in
seconds. The teacher speaks into headset 62a microphone
(or separate microphone 62b), and speech processor 60
converts his spoken speech into digitized data which is
stored in the virtual disk of personal computer 52. Once
the teacher again depresses the space bar (or there is a
time out) (decision block 924), the information stored in
the virtual disk is transferred to the floppy diskette or
hard disk. This procedure greatly increases sound quality
because final storage does not take place on an interrupt
driven basis.

In the preferred embodiment, each digitized speech phrase
is provided with a unique name. Specifically, each
digitized phrase file is automatically numbered in the
preferred embodiment with sequentially ascending numbers
(e.g., text1.SO, text2.SO, etc.) and the start and end of
each text phrase is flagged in the main text file
corresponding to the lesson. For example, a character
sequence such as "textstart(1)" may be added to the text
file at the point the teacher marked as corresponding to
the first recorded phrase, and a character sequence such
as "textend(1)" may be added to the text file at the
point the teacher selected as the end of the
corresponding text phrase. In this way, a linkage is
established between teacher-selected text strings within
the main text file and discrete files stored on mass
storage device 56 containing corresponding digitized text
- with a one-to-one correspondence generally existing
between text strings and digitized sound files in the
preferred embodiment. The backspace key allows the
teacher to easily move to a previously recorded phrase in
order to re-record it or the like (block 932, 934).
Depressing the ENTER key causes system 50 to go on to the
next phrase if the previous phrase has been recorded
(decision block 936) - or if all phrases have been
recorded, to move on to block 938 (which permits the
teacher to listen to the entire recorded speech on an
uninterrupted basis or to depress the enter key to save
the lesson (see blocks 940-944)).

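The linkage between marked text and numbered speech files
can be illustrated with the sketch below. The marker
strings and file names follow the examples given in the
text ("such as"), but the function name, the use of
character offsets to describe the teacher's selections
and the returned pair are assumptions for illustration.

```python
def mark_phrases(text, spans):
    """Flag each selected phrase in the text and name its speech file.

    `spans` is an ordered list of non-overlapping (start, end) character
    offsets chosen by the teacher. Returns (marked_text, file_names).
    """
    pieces = []
    file_names = []
    pos = 0
    for n, (start, end) in enumerate(spans, start=1):
        pieces.append(text[pos:start])            # unrecorded text, kept as is
        pieces.append(f"textstart({n})" + text[start:end] + f"textend({n})")
        file_names.append(f"text{n}.SO")          # one speech file per phrase
        pos = end
    pieces.append(text[pos:])
    return "".join(pieces), file_names
```

Words that fall outside every span carry no markers, which
matches the ability to skip over words that are not to be
recorded.
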
Finally, now that the teacher has stored the lesson he
can preprogram which of the student functions will be
available to the students for a particular lesson. In the
preferred embodiment, the AudioLab function is always
available to the student. However, certain exercises may
not be suitable for the SoundSort or AudioWrite function.
At block 946, system 50 prompts the teacher to choose
student functions that should be available to students
and to delete functions that should not be available to
the student. Blocks 950-958 result in displaying the same
main menu seen by the student and permitting the teacher
to delete one or both of the AudioWrite or SoundSort
options or to undelete those options (blocks 956, 954,
respectively). Depressing the ENTER key saves the student
selection data (block 958) and returns control to FIGURE
11A block 602 to permit the teacher to either exit the
studio routine or to work on creating another lesson.

The present invention thus provides an extremely 
flexible environment for creating preprogrammed au- 
dio visual lessons in which both the audio portion and 
the textual portion can be programmed by the teach- 
er. Once lessons have been constructed in this fash- 
ion, they can be used by the student in a variety of dif- 
ferent ways to develop different skills. For example, 
the AudioLab student function works on reading and
listening comprehension; the AudioWrite function
concentrates on listening, comprehension and writing
skills; while the SoundSort function concentrates on lis-
tening, comprehension, grammatical and other skills.
Since the same lesson can be used for various func- 
tions, the burden on the teacher is eased, while great 
flexibility is maintained. All of these features are pro- 
vided by a truly interactive language learning system 
in which the student is exposed to both audio and vis- 
ual stimuli and is capable of either listening to record- 
ed model digitized speech and/or to his own attempts 
to pronounce the speech using self-correction meth- 
odology. 

While the invention has been described in con-
nection with what is presently considered to be the
most practical and preferred embodiments, it is to be 
understood that the invention is not to be limited to the 
disclosed embodiments, but on the contrary, is in- 
tended to cover various modifications and equivalent 
arrangements included within the scope of the ap- 
pended claims. 



Claims 

1. An interactive language learning system comprising

storing means (56) for storing in digital form data
representative of a model of speech version of a passage
of language and for storing data representative of user
input speech,

a display (58) for displaying visual information
corresponding to the passage, and

selecting means (54) operatively connected to said
display (58) and to said storing means and operative to
provide a comparison of the user input speech with the
stored model version of the passage of language,

characterised in that

said storing means (56) is arranged to store a digitised
speech version of the passage of language and also to
store a digital data text version of said same passage,

said selecting means (54) is operable by a user to select
a portion of said passage and to cause text corresponding
to said selected passage portion to be displayed on said
display (58) based on said stored digital data text
version, and in that the system includes speech
processing means (60) for:

selecting the portion of said stored digitised model
speech version corresponding to said selected portion of
said passage,

converting said selected digitised model speech version
portion to audio signals for use in generating speech
sounds,

converting audio signals representing user input speech
into digitised signals representing said user speech
input and

subsequently reconverting said digitised speech signals
representing said user input speech to audio signals.

2. A system according to claim 1, wherein: said system
further includes a transducer (62b) which converts user
speech to audio signals;

said speech processing means (60) includes means
connected to said transducer (62b) for converting said
audio signals to digitized speech signals and for
temporarily storing said digitized speech signals;

said display (58) also displays a symbol; and

said system further includes user input (54) means for
permitting said user to (i) select said portion of said
passage by manipulating the position of said symbol
displayed by said display with respect to said displayed
text, and (ii) control said speech processing means to
rapidly alternate (a) converting said temporarily stored
digitized speech signals representing his own speech to
audio signals, and (b) converting said digitized speech
signals corresponding to said selected portion of said
passage to audio signals so as to alternately generate
sounds corresponding to said user's speech and sounds
corresponding to said stored digitized speech version.






3. A system according to claim 1, wherein: said
speech processing means (60) generates proc-
essor interrupts; and

said system further includes interrupt 
means for reading said digitized speech version 
from said storing means for conversion to audio 
signals by said speech processing means in re- 
sponse to said generated processor interrupts. 

4. A system according to claim 1, wherein: said se-
lecting means (54) includes means for selecting
the position and length of a portion of said pas- 
sage; 

said speech processing means (60) in- 
cludes further selecting means for selecting only 
those portions of said stored digitized speech 
version corresponding to said selected passage 
portion; and 

said speech processing means also in- 
cludes means for converting only said selected 
stored digitized speech version portions to audio 
signals. 

5. A system according to claim 1, wherein said first-
mentioned selecting means (54) comprises cur-
sor control means manipulable by said user for
selecting portions of said text displayed by said 
display and for thereby selecting corresponding 
portions of said stored digitized speech version 
for conversion to audio signals. 

6. A system according to claim 5, further including 
means connected to said cursor control means 
(54) and to said display (58) for causing said se- 
lected text portions displayed by said display to 
have a different appearance than the non-select- 
ed displayed text portions. 

7. A system according to claim 5, wherein said sys- 
tem further includes text display selection 
means, manipulable by said user and operatively 
connected to said display, for alternately select- 
ing: (a) display of only said selected text portions, 
and (b) display of the entire textual version of said 
passage including said selected text portion. 

8. A system according to claim 1, wherein said 
speech processing means (60) includes means 
for converting between audio signals and adap- 
tive differential pulse code modulation encoded 
digitized speech signals representing said audio 
signals. 

9. An interactive language learning system accord-
ing to claim 1, wherein said system can provide
in digital form data representative of speech sig-
nals, said digitized speech signals representing a
sequence of spoken phrases having initial order;
and said system includes

re-ordering means (52) for re-ordering
said plural phrases into a sequence having an or-
der different from said initial order;

said selecting means (54) operatively con-
nected to said re-ordering means and operable
by a user for permitting said user to further re-or-
der said plural phrases into a user-specified or-
der; and

said speech processing means (60) is con-
nected to said re-ordering means (52) and is re-
sponsive to said digitized speech signals, for gen-
erating audible versions of said phrases so as to
provide audible cues to said user.

10. An interactive language learning system accord-
ing to claim 9, further including a display (58) for
displaying symbols representing said plural phras-
es in at least said user-specified order.

11. An interactive language learning system accord-
ing to claim 9, including symbol display means
(58) connected to said re-ordering means (52) for
associating a symbol with each of said plural
phrases and for presenting a display of said sym-
bols in said re-ordered sequence;

testing means connected to said input 
means for comparing the user-selected re-or- 
dered sequence with said initial order. 

12. A system according to claim 11, wherein:

said selecting means (54) includes means 
for selecting any one of said symbols; and 

said speech processing means (60) is also
connected to said user selecting means and to
said symbol display means and includes means 
for converting the stored digitized speech asso- 
ciated with said selected symbol to audio signals. 

13. A system according to claim 11, wherein:

said symbol display means (58) includes: 
a left-hand display column which displays 
said symbols in said first-mentioned re-ordered 
sequence, and 
a right-hand display column which dis-
plays said symbols in said user-specified further
order; and 

said selecting means (54) includes means 
for moving said symbols from said left-hand dis-
play column to said right-hand display column in
response to user commands.

14. A system according to claim 13, wherein said 
speech processing means (60) is also connected
to said user selecting means (54) and includes
means for generating sounds corresponding to
said phrases in said initial order.

15. A system according to claim 11, wherein said
speech processing means (60) is also connected
to said user selecting means (54) and includes
means for selecting a starting point within said re-
ordered sequence, said selected starting point
being different from the beginning of said se-
quence, and for converting said corresponding
digitized speech signals to audio signals in said
initial order of said phrases beginning from said
starting point so as to provide audible speech cor-
responding to less than said entire sequence of
phrases. 
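
As an illustration only, the phrase re-ordering exercise of claims 9 to 15 might be modelled along the following lines. This is a minimal sketch and not the patented implementation: the class name ReorderExercise, its methods and the use of list indices as the displayed "symbols" are all assumptions made for the example, and audio playback is reduced to returning the phrases that would be played.

```python
import random


class ReorderExercise:
    """Hypothetical model of the phrase re-ordering exercise (claims 9 to 15)."""

    def __init__(self, phrases, seed=None):
        self.phrases = list(phrases)                 # phrases in their initial order
        self.initial = list(range(len(self.phrases)))
        rng = random.Random(seed)
        scrambled = self.initial[:]
        # Re-ordering means: shuffle until the order differs from the initial one.
        while len(scrambled) > 1 and scrambled == self.initial:
            rng.shuffle(scrambled)
        self.left = scrambled                        # left-hand column: scrambled symbols
        self.right = []                              # right-hand column: user-specified order

    def move(self, symbol):
        """Move one symbol from the left-hand column to the right-hand column."""
        self.left.remove(symbol)
        self.right.append(symbol)

    def is_correct(self):
        """Testing means: compare the user-specified order with the initial order."""
        return self.right == self.initial

    def cue_from(self, start):
        """Phrases to be played in initial order, beginning at symbol `start`."""
        return self.phrases[start:]


if __name__ == "__main__":
    exercise = ReorderExercise(
        ["Good morning.", "How are you?", "Fine, thanks."], seed=1
    )
    for symbol in [0, 1, 2]:                         # the user restores the initial order
        exercise.move(symbol)
    print(exercise.is_correct())                     # prints True
```

Representing each phrase by its index in the initial order keeps the comparison of claim 11 to a single list equality, and the playback from a chosen starting point of claim 15 to a slice.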



Patentansprüche

1. Interaktives Sprachenlernsystem mit

Speichermitteln (56) zum Speichern in di-
gitaler Form von ein Modell der Sprachversion ei-
nes Sprachentexts darstellenden Daten und zum
Speichern von Benutzereingangssprache dar-
stellenden Daten,

einer Anzeige (58) zum Anzeigen von dem
Text entsprechenden Sichtinformationen und

mit der besagten Anzeige (58) und den be-
sagten Speichermitteln wirkverbundenen Aus-
wahlmitteln (54) zur Bereitstellung eines Ver-
gleichs der Benutzereingangssprache mit der ge-
speicherten Modellfassung des Sprachentexts,
dadurch gekennzeichnet, daß

das besagte Speichermittel (56) zur Spei-
cherung einer digitalisierten Sprachfassung des
Sprachentexts und auch zur Speicherung einer
digitalen Datentextfassung des besagten selben
Texts angeordnet ist,

das besagte Auswahlmittel (54) von einem
Benutzer zur Auswahl eines Teils des besagten
Texts und zum Bewirken, daß dem ausgewählten
Textteil entsprechender Text auf Grundlage der
besagten gespeicherten digitalen Datentextfas-
sung auf besagter Anzeige (58) angezeigt wird,
betätigt werden kann, und daß das System
Sprachverarbeitungsmittel (60) zum

Auswählen des dem besagten ausgewähl-
ten Teil des besagten Textes entsprechenden
Teils der besagten gespeicherten digitalisierten
Modellsprachfassung,

Verwandeln des besagten ausgewählten
digitalisierten Modellsprachfassungsteils in Ton-
signale zur Verwendung bei der Erzeugung von
Sprachlauten,

Umwandeln von die Benutzereingangs-
sprache darstellenden Tonsignalen in die besag-
te Benutzerspracheingabe darstellende digitali-
sierte Signale und

nachfolgenden Rückverwandeln der die
besagte Benutzereingangssprache darstellen-
den besagten digitalisierten Sprachsignale in
Tonsignale umfaßt.

2. System nach Anspruch 1, wobei das besagte Sy-
stem weiterhin einen Wandler (62b) enthält, der
Benutzersprache in Tonsignale umwandelt;

das besagte Sprachverarbeitungsmittel
(60) mit dem besagten Wandler (62b) verbunde-
ne Mittel zum Verwandeln der besagten Tonsi-
gnale in digitalisierte Sprachsignale und zum
zeitweiligen Speichern der besagten digitalisier-
ten Sprachsignale enthält;

die besagte Anzeige (58) auch ein Symbol
anzeigt; und

das besagte System weiterhin Benutzer-
eingabemittel (54) enthält, um dem besagten Be-
nutzer zu erlauben, (i) den besagten Teil des be-
sagten Textes durch Handhabung der Position
des durch die besagte Anzeige angezeigten Sym-
bols in bezug auf den besagten angezeigten Text
auszuwählen, und (ii) das besagte Sprachverar-
beitungsmittel so zu steuern, daß es schnell zwi-
schen (a) dem Verwandeln der seine eigene
Sprache darstellenden besagten zeitweilig ge-
speicherten digitalisierten Sprachsignale in Ton-
signale und (b) dem Verwandeln der dem besag-
ten ausgewählten Teil des besagten Textes ent-
sprechenden digitalisierten Sprachsignale in
Tonsignale wechselt, um auf diese Weise wech-
selweise Laute zu erzeugen, die der Sprache des
besagten Benutzers entsprechen, und Laute, die
der besagten gespeicherten digitalisierten
Sprachfassung entsprechen.

3. System nach Anspruch 1, wobei das besagte
Sprachverarbeitungsmittel (60) Prozessorunter-
brechungen erzeugt; und

das besagte System weiterhin Unterbre-
chungsmittel zum Auslesen der besagten digita-
lisierten Sprachfassung aus dem besagten Spei-
chermittel zur Verwandlung in Tonsignale durch
das besagte Sprachverarbeitungsmittel als Re-
aktion auf die besagten erzeugten Prozessorun-
terbrechungen enthält.

4. System nach Anspruch 1, wobei das besagte
Auswahlmittel (54) Mittel zum Auswählen der Po-
sition und Länge eines Teils des besagten Textes
enthält;

das besagte Sprachverarbeitungsmittel
(60) weitere Auswahlmittel zum Auswählen von
nur denjenigen Teilen der besagten gespeicher-
ten digitalisierten Sprachfassung enthält, die
dem besagten ausgewählten Textteil entspre-
chen; und

das besagte Sprachverarbeitungsmittel
auch Mittel zum Verwandeln von nur den besag-
ten ausgewählten gespeicherten digitalisierten
Sprachfassungsteilen in Tonsignale enthält.

5. System nach Anspruch 1, wobei das besagte
ersterwähnte Auswahlmittel (54) vom besagten
Benutzer handhabbare Textzeigersteuermittel
zur Auswahl von Teilen des besagten durch die
besagte Anzeige angezeigten Textes und zur ent-
sprechenden Auswahl dadurch von entsprechen-
den Teilen der besagten gespeicherten digitali-
sierten Sprachfassung zur Verwandlung in Tonsi-
gnale umfaßt.

6. System nach Anspruch 5, weiterhin mit dem be-
sagten Textzeigersteuermittel (54) und der be-
sagten Anzeige (58) verbundenen Mitteln zum
Bewirken, daß die durch die besagte Anzeige an-
gezeigten besagten ausgewählten Textteile eine
von den nicht ausgewählten angezeigten Texttei-
len unterschiedliche Erscheinungsform aufwei-
sen.

7. System nach Anspruch 5, wobei das besagte Sy-
stem weiterhin vom besagten Benutzer handhab-
bare und mit der besagten Anzeige
wirkverbundene Textanzeigeauswahlmittel ent-
hält, zur wechselweisen Auswahl von: (a) der An-
zeige von nur den besagten ausgewählten Text-
teilen und (b) der Anzeige der gesamten Textfas-
sung des besagten Textes einschließlich des be-
sagten ausgewählten Textteils.

8. System nach Anspruch 1, wobei das besagte
Sprachverarbeitungsmittel (60) Mittel zum Um-
wandeln zwischen Tonsignalen und die besagten
Tonsignale darstellenden mit adaptiver Diffe-
renzpulscodemodulation codierten digitalisierten
Sprachsignalen enthält.

9. Interaktives Sprachenlernsystem nach Anspruch
1, wobei das besagte System Sprachensignale
darstellende Daten in digitaler Form bereitstellen
kann, wobei die besagten digitalisierten Sprach-
signale eine Folge gesprochener Sätze mit An-
fangsordnung darstellen, und das besagte Sy-
stem folgendes enthält:

Umordnungsmittel (52) zum Umordnen
der besagten mehrfachen Sätze in eine Folge mit
einer sich von der besagten Anfangsordnung un-
terscheidenden Ordnung;

mit dem besagten Umordnungsmittel in
Wirkverbindung stehende und von einem Benut-
zer betätigbare besagte Auswahlmittel (54), um
dem besagten Benutzer zu erlauben, die besag-
ten mehrfachen Sätze weiterhin in eine vom Be-
nutzer angegebene Ordnung umzuordnen; und

wobei das besagte Sprachverarbeitungs-
mittel (60) mit dem besagten Umordnungsmittel
(52) verbunden ist und auf die besagten digitali-
sierten Sprachsignale reagiert, um hörbare Fas-
sungen der besagten Sätze zu erzeugen, um für
den besagten Benutzer hörbare Stichworte be-
reitzustellen.

10. Interaktives Sprachenlernsystem nach Anspruch
9, weiterhin mit einer Anzeige (58) zum Anzeigen
von die besagten mehrfachen Sätze darstellen-
den Symbolen in mindestens der besagten vom
Benutzer angegebenen Ordnung.

11. Interaktives Sprachenlernsystem nach Anspruch
9 mit mit dem besagten Umordnungsmittel (52)
verbundenen Symbolanzeigemitteln (58) zum
Zuordnen eines Symbols zu jedem der besagten
mehrfachen Sätze und zum Darstellen einer An-
zeige der besagten Symbole in der besagten um-
geordneten Folge;

mit dem besagten Eingabemittel verbun-
denen Prüfmitteln zum Vergleichen der vom Be-
nutzer ausgewählten ungeordneten Folge mit der
besagten Anfangsordnung.

12. System nach Anspruch 11, wobei das besagte
Auswahlmittel (54) Mittel zum Auswählen von ei-
nem beliebigen der besagten Symbole enthält;
und

das besagte Sprachverarbeitungsmittel
(60) auch mit dem besagten Benutzerauswahl-
mittel und mit dem besagten Symbolanzeigemit-
tel verbunden ist und Mittel zum Verwandeln der
dem besagten ausgewählten Symbol zugeordne-
ten gespeicherten digitalisierten Sprache in Ton-
signale enthält.

13. System nach Anspruch 11, wobei das besagte
Symbolanzeigemittel (58) folgendes enthält:

eine linke Anzeigespalte, die die besagten
Symbole in der besagten ersterwähnten unge-
ordneten Folge anzeigt, und

eine rechte Anzeigespalte, die die besag-
ten Symbole in der besagten vom Benutzer ange-
gebenen weiteren Ordnung anzeigt; und

wobei das besagte Auswahlmittel (54) Mit-
tel zum Verlagern der besagten Symbole aus der
besagten linken Anzeigespalte in die besagte
rechte Anzeigespalte als Reaktion auf Nutzerbe-
fehle enthält.

14. System nach Anspruch 13, wobei das besagte
Sprachverarbeitungsmittel (60) auch mit dem be-
sagten Benutzerauswahlmittel (54) verbunden
ist und Mittel zum Erzeugen von den besagten
Sätzen entsprechenden Lauten in der besagten
Anfangsordnung enthält.

15. System nach Anspruch 11, wobei das besagte
Sprachverarbeitungsmittel (60) auch mit dem be-
sagten Benutzerauswahlmittel (54) verbunden
ist und Mittel zum Auswählen eines Anfangs-
punktes in der besagten ungeordneten Folge ent-
hält, wobei sich der besagte ausgewählte An-
fangspunkt vom Anfang der besagten Folge un-
terscheidet, und zum Verwandeln der besagten
entsprechenden digitalisierten Sprachsignale in
Tonsignale in der besagten Anfangsordnung der
besagten Sätze ab dem besagten Anfangspunkt
zur Bereitstellung von hörbarer Sprache, die ei-
nem geringeren Teil als der besagten Gesamtfol-
ge von Sätzen entspricht.



Revendications 

1. Système interactif d'enseignement de langues
comprenant

un moyen de mémorisation (56) pour mé-
moriser sous forme numérique des données re-
présentatives d'un modèle de version vocale d'un
passage de langue et pour mémoriser des don-
nées représentatives d'une voix entrante d'utili-
sateur,

un affichage (58) pour afficher des infor-
mations visuelles correspondant au passage, et

un moyen de sélection (54) connecté de
manière opérationnelle audit affichage (58) et
audit moyen de mémorisation et fonctionnant
pour fournir une comparaison de la voix entrante
d'utilisateur avec la version modèle mémorisée
du passage de langue,

caractérisé en ce que

ledit moyen de mémorisation (56) est dis-
posé pour mémoriser une version vocale numé-
risée du passage de langue et aussi pour mémo-
riser une version de texte de données numéri-
ques dudit même passage,

ledit moyen de sélection (54) peut être ac-
tionné par un utilisateur pour sélectionner une
partie dudit passage et pour faire en sorte que le
texte correspondant à ladite partie de passage
sélectionnée soit affiché sur ledit affichage (58)
d'après ladite version de texte de données numé-
riques mémorisée, et en ce que le système
comporte un moyen de traitement vocal (60)
pour

sélectionner la partie de ladite version vo-
cale modèle numérisée mémorisée correspon-
dant à ladite partie sélectionnée dudit passage,

convertir ladite partie de version vocale
modèle numérisée sélectionnée en signaux au-
dio à utiliser dans la génération de sons vocaux,

convertir des signaux audio représentant
une voix entrante d'utilisateur en signaux numé-
risés représentant ladite entrée de voix d'utilisa-
teur et

reconvertir ultérieurement lesdits signaux
vocaux numérisés représentant ladite voix en-
trante d'utilisateur en signaux audio.



2. Système selon la revendication 1, dans lequel: le-
dit système comporte en outre un transducteur
(62b) qui convertit la voix d'utilisateur en signaux
audio;

ledit moyen de traitement vocal (60)
comporte un moyen connecté audit transducteur
(62b) pour convertir lesdits signaux audio en si-
gnaux vocaux numérisés et pour mémoriser tem-
porairement lesdits signaux vocaux numérisés;

ledit affichage (58) affiche aussi un sym-
bole; et

ledit système comporte en outre un moyen
d'entrée d'utilisateur (54) pour permettre audit
utilisateur de (i) sélectionner ladite partie dudit
passage en manipulant la position dudit symbole
affiché par ledit affichage en ce qui concerne le-
dit texte affiché, et (ii) commander ledit moyen de
traitement vocal pour alterner rapidement entre
(a) la conversion desdits signaux vocaux numé-
risés mémorisés temporairement représentant sa
propre voix en signaux audio, et (b) la conversion
desdits signaux vocaux numérisés correspon-
dant à ladite partie sélectionnée dudit passage
en signaux audio de manière à générer alternati-
vement des sons correspondant à la voix dudit
utilisateur et des sons correspondant à ladite ver-
sion de voix numérisée mémorisée.

3. Système selon la revendication 1, dans lequel: le-
dit moyen de traitement vocal (60) génère des in-
terruptions de processeur; et

ledit système comporte en outre un moyen
d'interruption pour lire ladite version vocale nu-
mérisée à partir dudit moyen de mémorisation
pour la convertir en signaux audio par ledit moyen
de traitement vocal en réponse auxdites interrup-
tions de processeur générées.

4. Système selon la revendication 1, dans lequel: le-
dit moyen de sélection (54) comporte un moyen
pour sélectionner la position et la longueur d'une
partie dudit passage;

ledit moyen de traitement vocal (60)
comporte un autre moyen de sélection pour ne
sélectionner que les parties de ladite version vo-
cale numérisée mémorisée correspondant à ladi-
te partie de passage sélectionnée; et

ledit moyen de traitement vocal comporte
aussi un moyen pour ne convertir que lesdites
parties de version vocale numérisée mémorisée
sélectionnées en signaux audio.

5. Système selon la revendication 1, dans lequel le-
dit premier moyen de sélection mentionné (54)
comporte un moyen de commande à curseur ma-
nipulable par ledit utilisateur pour sélectionner
des parties dudit texte affiché par ledit affichage
et pour ainsi sélectionner des parties correspon-
dantes de ladite version vocale numérisée mé-
morisée pour les convertir en signaux audio.

6. Système selon la revendication 5, comportant en
outre un moyen connecté audit moyen de
commande à curseur (54) et audit affichage (58)
pour faire en sorte que lesdites parties de texte
sélectionnées affichées par ledit affichage aient
un aspect différent de celui des parties de texte
affichées non sélectionnées.

7. Système selon la revendication 5, dans lequel le-
dit système comporte un moyen de sélection
d'affichage de texte, manipulable par ledit utilisa-
teur et connecté de manière opérationnelle audit
affichage, pour sélectionner alternativement (a)
l'affichage desdites parties de texte sélection-
nées uniquement, et (b) l'affichage de toute la
version textuelle dudit passage comportant ladite
partie de texte sélectionnée.

8. Système selon la revendication 1, dans lequel le-
dit moyen de traitement vocal (60) comporte un
moyen pour convertir entre des signaux audio et
des signaux vocaux numérisés codés par modu-
lation à impulsions et codage différentielle adap-
table représentant lesdits signaux audio.

9. Système interactif d'enseignement de langues
selon la revendication 1, dans lequel ledit systè-
me peut fournir sous forme numérique des don-
nées représentatives de signaux vocaux, lesdits
signaux vocaux numérisés représentant une sé-
quence de phrases parlées ayant un ordre initial;
et ledit système comporte

un moyen de remise en ordre (52) pour re-
mettre en ordre lesdites phrases multiples en une
séquence ayant un ordre différent dudit ordre ini-
tial;

ledit moyen de sélection (54) connecté de
manière opérationnelle audit moyen de remise en
ordre et actionnable par un utilisateur pour per-
mettre audit utilisateur d'encore remettre en or-
dre lesdites phrases multiples en un ordre spécifié
par l'utilisateur; et

ledit moyen de traitement vocal (60) est
connecté audit moyen de remise en ordre (52) et
sensible auxdits signaux vocaux numérisés, pour
générer des versions sonores desdites phrases
de manière à fournir des repères sonores audit
utilisateur.

10. Système interactif d'enseignement de langues
selon la revendication 9, comportant en outre un
affichage (58) pour afficher des symboles repré-
sentant lesdites phrases multiples dans au moins
ledit ordre spécifié par l'utilisateur.

11. Système interactif d'enseignement de langues
selon la revendication 9, comportant un moyen
d'affichage de symboles (58) connecté audit
moyen de remise en ordre (52) pour associer un
symbole à chacune desdites phrases multiples et
pour présenter un affichage desdits symboles
dans ladite séquence remise en ordre;

un moyen d'essai connecté audit moyen
d'entrée pour comparer la séquence remise en
ordre sélectionnée par l'utilisateur avec ledit or-
dre initial.

12. Système selon la revendication 11, dans lequel:

ledit moyen de sélection (54) comporte un
moyen pour sélectionner n'importe lequel des-
dits symboles; et

ledit moyen de traitement vocal (60) est
aussi connecté audit moyen de sélection d'utili-
sateur et audit moyen d'affichage de symboles et
comporte un moyen pour convertir la voix numé-
risée mémorisée associée audit symbole sélec-
tionné en signaux audio.

13. Système selon la revendication 11, dans lequel:

ledit moyen d'affichage de symboles (58)
comporte:

une colonne d'affichage gauche qui affi-
che lesdits symboles dans ladite première sé-
quence remise en ordre mentionnée, et

une colonne d'affichage droite qui affiche
lesdits symboles dans ledit autre ordre spécifié
par l'utilisateur; et

ledit moyen de sélection (54) comporte un
moyen pour déplacer lesdits symboles de ladite
colonne d'affichage gauche à ladite colonne d'af-
fichage droite en réponse à des commandes
d'utilisateur.

14. Système selon la revendication 13, dans lequel
ledit moyen de traitement vocal (60) est aussi
connecté audit moyen de sélection d'utilisateur
(54) et comporte un moyen pour générer des
sons correspondant auxdites phrases dans ledit
ordre initial.

15. Système selon la revendication 11, dans lequel
ledit moyen de traitement vocal (60) est aussi
connecté audit moyen de sélection d'utilisateur
(54) et comporte un moyen pour sélectionner un
point de départ au sein de ladite séquence remise
en ordre, ledit point de départ sélectionné étant
différent du début de ladite séquence, et pour
convertir lesdits signaux vocaux numérisés
correspondants en signaux audio dans ledit ordre
initial desdites phrases en commençant audit
point de départ de manière à fournir une voix so-
nore correspondant à moins de ladite séquence
entière de phrases.